Voice Phishing (Vishing): How to Recognize and Prevent Phone Scams

A technical examination of voice phishing (vishing) attacks, covering caller-ID spoofing, AI voice cloning, real-world attack patterns, recognition techniques, and the organisational controls that prevent phone-based social engineering from compromising credentials, financial assets, and sensitive data.

Adebisi Oluwasoya

Senior Security Analyst · May 5, 2026

Key Takeaways

  • Vishing attacks bypass email security entirely, targeting the phone channel where employees have fewer technical defences and stronger trust assumptions, resulting in success rates 3x higher than email phishing.
  • AI voice-cloning technology can produce a convincing deepfake of any voice from as little as 3-10 seconds of sample audio, enabling attackers to impersonate CEOs, IT staff, or family members in real-time phone calls.
  • Caller-ID spoofing is trivially easy using VoIP services and SIP manipulation, making the displayed phone number completely unreliable as an identity verification mechanism.
  • The STIR/SHAKEN framework provides cryptographic caller-ID attestation, but adoption gaps and international-call limitations mean it cannot be relied upon as a sole defence against spoofed numbers.
  • The most effective vishing defence is a call-back verification policy: never act on sensitive requests received via inbound calls; instead, hang up and call back on a number obtained independently from a trusted source.

Voice phishing — vishing — is social engineering conducted through the telephone channel. While email phishing dominates security awareness discussions, vishing exploits a channel where humans are psychologically more compliant, technically less protected, and culturally conditioned to respond. When the phone rings and someone identifies themselves as your bank's fraud department, your IT help desk, or your CEO, the instinct is to engage, not to verify.

Vishing predates email phishing by decades; telephone fraud is as old as the phone system itself. Two technological developments, however, have transformed it from a manual, low-scale attack into an industrial-grade threat: VoIP-based caller-ID spoofing, which lets an attacker display any phone number on the recipient's screen, and AI voice cloning, which can reproduce a voice from a few seconds of audio. Together, these technologies enable real-time, convincing impersonation of almost any person at negligible cost.

How Vishing Works: The Technical Stack

Step 1 — Caller-ID Spoofing

Caller ID was designed as a convenience feature, not a security mechanism. The phone network transmits caller-ID information in the Calling Party Number (CPN) field of the call-setup signalling. In traditional PSTN networks, the originating carrier sets the CPN. In VoIP networks, SIP lets the calling endpoint put any value in the "From" header, and unless a carrier along the path validates or overrides it, that value is what the recipient sees.

This architectural weakness means that the number displayed on your phone's screen can be set to anything:

  • Your company's main switchboard number
  • Your bank's customer-service line
  • A government agency (IRS, HMRC, police)
  • Your CEO's direct mobile number
  • Your own phone number (to create confusion)

Spoofing services are commercially available. Some require no more than signing up for a VoIP provider and configuring the outbound caller-ID field. Dedicated spoofing services (now illegal in many jurisdictions but still operational) charge as little as $0.01 per minute for calls with arbitrary caller IDs.
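The gap between a caller-chosen identity and a carrier-asserted one is visible in the SIP headers themselves. The following is a minimal Python sketch, not a production SIP parser. It assumes the receiving side can see the raw INVITE and that a trusted upstream carrier adds a P-Asserted-Identity header (RFC 3325). The sample message, the phone numbers, and the function names are all invented for illustration.

```python
import re
from typing import Optional

# Hypothetical INVITE. The From header is chosen by the caller, while
# P-Asserted-Identity is (in this sketch) added by a trusted carrier.
SAMPLE_INVITE = (
    "INVITE sip:+12025550123@pbx.example.net SIP/2.0\r\n"
    "Via: SIP/2.0/UDP gw.example.org;branch=z9hG4bK776asdhds\r\n"
    'From: "Bank Fraud Team" <sip:+12025550100@voip.example.org>;tag=1928301774\r\n'
    "To: <sip:+12025550123@pbx.example.net>\r\n"
    "P-Asserted-Identity: <sip:+442079460999@carrier.example>\r\n"
    "Call-ID: a84b4c76e66710\r\n"
    "CSeq: 314159 INVITE\r\n"
)


def parse_headers(raw: str) -> dict:
    """Return SIP headers as a dict keyed by lower-cased header name."""
    headers = {}
    for line in raw.splitlines()[1:]:  # skip the request line
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers


def extract_number(header_value: str) -> Optional[str]:
    """Pull the user part (digits, optional leading +) out of a sip: URI."""
    # Only sip:/sips: URIs are handled here; tel: URIs are out of scope.
    match = re.search(r"sips?:(\+?\d+)@", header_value)
    return match.group(1) if match else None


def from_matches_asserted(raw: str) -> Optional[bool]:
    """Compare the caller-chosen From number with the asserted identity.

    Returns None when no P-Asserted-Identity header is present, because
    there is then nothing trustworthy to compare against.
    """
    headers = parse_headers(raw)
    asserted = headers.get("p-asserted-identity")
    if asserted is None:
        return None
    return extract_number(headers.get("from", "")) == extract_number(asserted)


print(from_matches_asserted(SAMPLE_INVITE))
```

In this sample the displayed number and the asserted number differ, which is the pattern a PBX or session border controller could flag. A missing assertion is reported separately rather than treated as a match.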

Step 2 — AI Voice Cloning

Voice cloning has moved from research labs to consumer products. The technology works by training a neural network on audio samples of the target voice to learn the speaker's vocal characteristics (pitch, cadence, formant frequencies, accent, emotional inflection). Once trained, the model can generate new speech in the target voice from any text input.

Key capabilities in 2026:

  • Sample efficiency — convincing clones from 3-10 seconds of audio. High-fidelity clones from 1-5 minutes.
  • Real-time synthesis — the attacker speaks into a microphone and the output is transformed into the target voice in real-time, with latency under 200ms. This enables live conversations, not just pre-recorded playback.
  • Emotional control — the attacker can adjust the emotional tone (urgency, anger, calm, concern) of the synthesised voice independently of their own vocal delivery.
  • Multilingual capability — voice clones can generate speech in languages the original speaker never spoke, using the target's voice characteristics.

Sample Audio Sources

Attackers obtain voice samples from sources the target may never consider sensitive:

  • Earnings calls and investor presentations (publicly archived)
  • Conference talks and panel discussions (YouTube, Vimeo)
  • Podcast interviews and media appearances
  • Voicemail greetings (call the target's direct number after hours)
  • Social media stories and video posts
  • Webinar recordings and training videos

Step 3 — Pretext and Social Engineering

With a spoofed number and a cloned voice (or a convincing human actor), the attacker executes the social-engineering script. Vishing pretexts are consistently effective because the phone channel activates different psychological responses than email:

  • Immediacy — a phone call demands real-time response. There is no time to consult a colleague, check a policy, or think carefully.
  • Emotional engagement — voice conveys urgency, fear, authority, and empathy more effectively than text. An urgent voice triggers emotional compliance.
  • Social pressure — hanging up on someone feels rude. People stay on calls they would immediately delete as emails.
  • Reduced verification — email can be forwarded to IT for analysis. Phone calls are ephemeral; there is no "forward this call to security" button.
Figure 1 — The three-layer vishing attack stack. All three components are commercially available and combine to produce phone calls that are indistinguishable from legitimate ones.

Common Vishing Scenarios

Scenario 1 — IT Help Desk Impersonation

The attacker calls an employee, spoofing the company's IT help desk number. The pretext: "We have detected unusual activity on your account and need to verify your identity. Can you please provide your employee ID and the code from your authenticator app?" The employee provides the MFA code, which the attacker uses in real-time to complete a login to the employee's account.

This scenario is devastating because it bypasses MFA without any technical exploit. The employee voluntarily provides the one-time code, and the attacker uses it within its validity window: TOTP codes typically rotate every 30 seconds, and many verifiers also accept the adjacent time step to tolerate clock drift.
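To see why a code read out over the phone still works for the attacker, it helps to look at how a verifier checks TOTP codes. The sketch below implements the RFC 6238 algorithm (HMAC-SHA1) with the Python standard library. The `verify` helper, its `drift_steps` tolerance of one step, and the timings are illustrative assumptions, not any particular vendor's implementation; the secret is the published RFC 6238 test secret.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) for the given time."""
    counter = unix_time // step
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation, as in RFC 4226
    binary = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(binary % 10 ** digits).zfill(digits)


def verify(secret: bytes, submitted: str, now: int,
           step: int = 30, drift_steps: int = 1) -> bool:
    """Accept a code from the current step or an adjacent one (assumed policy)."""
    for drift in range(-drift_steps, drift_steps + 1):
        candidate_time = now + drift * step
        if candidate_time < 0:
            continue
        if hmac.compare_digest(totp(secret, candidate_time, step), submitted):
            return True
    return False


SECRET = b"12345678901234567890"  # RFC 6238 test secret, not a real key

# The victim reads out the code shown at t=59 s.
code_read_out = totp(SECRET, 59)

# The attacker types it in 16 seconds later, then again much later.
print(verify(SECRET, code_read_out, now=75))
print(verify(SECRET, code_read_out, now=150))
```

The verifier has no way to tell who is typing the code. Within the acceptance window, a code relayed by phone is as valid as one entered by the legitimate user, which is why the control has to be "never read a code to a caller" rather than anything in the algorithm.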

Scenario 2 — Bank Fraud Department

The victim's caller ID shows their bank's legitimate phone number. The caller identifies themselves as the fraud department: "We have detected a suspicious transaction on your account. For security, we need to verify your identity. Can you confirm the last four digits of your card number and the one-time code we just sent to your phone?" The "code" is actually a password-reset or transaction-authorisation code triggered by the attacker.

Scenario 3 — CEO Wire Transfer (Vishing Whaling)

The CFO receives a call from what appears to be the CEO's mobile number. The voice is indistinguishable from the CEO's (cloned from an earnings call). The CEO claims to be travelling, mentions a confidential acquisition, and requests an urgent wire transfer. The CFO, recognising the voice and number, initiates the transfer. Reported losses from a single incident of this kind have exceeded $35 million.

Scenario 4 — Government Agency Impersonation

Attackers impersonate the tax authority (IRS, HMRC), law enforcement, or immigration services. The pretext involves immediate legal consequences: "You have an outstanding tax liability. If you do not pay immediately, a warrant will be issued for your arrest." Fear and urgency override rational judgment, particularly for individuals unfamiliar with how government agencies actually operate.

Scenario 5 — Hybrid Vishing (Email + Phone)

The most sophisticated attacks combine channels. The victim receives a legitimate-looking email about an account issue, followed by a phone call from someone claiming to be the sender of the email. The email legitimises the call: "I sent you an email earlier about the security update. Let me walk you through it." The phone call then directs the victim to a phishing page or extracts credentials directly.

STIR/SHAKEN: Technical Countermeasures for Caller-ID Spoofing

The STIR/SHAKEN framework (Secure Telephone Identity Revisited / Signature-based Handling of Asserted information using toKENs) is the telecommunications industry's answer to caller-ID spoofing. It provides cryptographic attestation of caller identity.

How It Works

  1. The originating carrier creates a digital signature (using a certificate from a trusted Certificate Authority) that attests to the calling number. The attestation has three levels:
    • A (Full) — the carrier has authenticated the caller and confirmed they are authorised to use the number
    • B (Partial) — the carrier has authenticated the caller but cannot confirm authorisation for the specific number
    • C (Gateway) — the carrier received the call from an international gateway or another network and cannot verify the origin
  2. The terminating carrier verifies the digital signature. If the signature is valid and the attestation level is A, the call is marked as verified. If the signature is missing, invalid, or low-attestation, the carrier may flag or block the call.
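The two steps above can be sketched in Python. In SIP, the attestation travels as a signed token (a PASSporT, RFC 8225, with the SHAKEN extension from RFC 8588) whose payload carries the `attest`, `orig`, and `dest` claims. This sketch only decodes a sample token and applies a labelling rule. It deliberately does not verify the ES256 signature against the certificate referenced by `x5u`, which a real terminating carrier must do before trusting any claim. The sample values, the certificate URL, and the `label_call` policy are hypothetical.

```python
import base64
import json


def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    # Restore the padding that base64url tokens omit.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def make_sample_passport(attest: str, orig_tn: str, dest_tn: str) -> str:
    """Build an UNSIGNED sample token; the third segment is a placeholder."""
    header = {"alg": "ES256", "typ": "passport", "ppt": "shaken",
              "x5u": "https://cert.example.org/sp.pem"}  # made-up URL
    payload = {"attest": attest, "dest": {"tn": [dest_tn]},
               "iat": 1700000000, "orig": {"tn": orig_tn},
               "origid": "00000000-0000-0000-0000-000000000000"}
    parts = [json.dumps(header).encode(), json.dumps(payload).encode(),
             b"placeholder-signature"]
    return ".".join(b64url_encode(part) for part in parts)


def read_claims(token: str) -> dict:
    """Decode the payload. Signature verification is intentionally omitted."""
    header_b64, payload_b64, _signature = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    if header.get("ppt") != "shaken":
        raise ValueError("not a SHAKEN PASSporT")
    return json.loads(b64url_decode(payload_b64))


def label_call(claims: dict, displayed_number: str) -> str:
    """Hypothetical display policy keyed on the attestation level."""
    if claims.get("orig", {}).get("tn") != displayed_number:
        return "mismatch"
    return "verified" if claims.get("attest") == "A" else "unverified"


token = make_sample_passport("A", "12025550100", "12025550123")
print(label_call(read_claims(token), "12025550100"))
```

Only attestation level A with a matching originating number is labelled as verified here. Levels B and C fall through to "unverified", mirroring the flag-or-block decision described above.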

Limitations

  • International calls — STIR/SHAKEN does not cover calls originating from international carriers that have not implemented the framework, and many vishing calls originate from countries without a STIR/SHAKEN implementation.
  • Landline and legacy networks — traditional PSTN infrastructure cannot carry STIR/SHAKEN signatures. Calls from legacy networks receive C-level attestation at best.
  • Adoption gaps — while US carriers are required to implement STIR/SHAKEN (FCC mandate), enforcement of call blocking based on attestation levels is inconsistent.
  • Caller-ID display — many consumer phones do not display attestation level, so users cannot see whether a call's caller-ID has been cryptographically verified or not.

STIR/SHAKEN is a meaningful improvement but is not sufficient as a sole defence. It reduces domestic caller-ID spoofing but does not stop spoofed calls that originate abroad, transit legacy networks, or arrive with only low-level attestation.

Recognition Techniques for Individuals

Red Flags in Vishing Calls

  1. Unsolicited inbound calls requesting sensitive information — legitimate organisations almost never call you to request passwords, MFA codes, account numbers, or payment. They already have your information.
  2. Urgency and time pressure — "You must act now or your account will be locked / a warrant will be issued / you will lose access." Legitimate callers can wait.
  3. Requests for one-time codes — no legitimate organisation will ask you to read back an MFA code, SMS verification code, or one-time password received on your phone.
  4. Threats of legal consequences — government agencies do not call to threaten arrest. Tax authorities send written notices.
  5. Refusal to provide a call-back number — if the caller refuses to let you hang up and call back, it is almost certainly a scam.
  6. Unusual audio quality — AI voice clones occasionally produce artifacts: unnatural pauses, slight robotic quality, or inconsistent background noise.
  7. Request to install software — "I need you to install TeamViewer / AnyDesk so I can help you" hands the caller remote control of your device. These are legitimate remote-access tools, which is exactly why attackers misuse them.

The Call-Back Verification Protocol

The single most effective defence against vishing is simple:

  1. Receive the call. Do not provide information.
  2. Tell the caller: "Thank you for contacting me. I will call you back to verify." Note the claimed name and department.
  3. Hang up.
  4. Find the legitimate phone number independently (from the official website, your employee directory, or a known contact). Never use a number provided by the caller.
  5. Call the legitimate number and ask for the person who called.

If the call was legitimate, the caller will be there. If it was vishing, you have prevented the compromise with zero risk. This protocol must be trained until it becomes reflexive.
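For teams that build the protocol into a help-desk or softphone tool, the key rule in step 4 can be made explicit in code. This is a minimal sketch under stated assumptions: `TRUSTED_DIRECTORY` stands in for an employee directory or vetted contact list, and every name and number in it is invented.

```python
from typing import Optional

# Hypothetical directory maintained by the organisation. It is populated
# from official sources, never from information given during a call.
TRUSTED_DIRECTORY = {
    "it help desk": "+12025550111",
    "bank fraud department": "+12025550122",
}


def callback_number(claimed_identity: str,
                    caller_supplied_number: Optional[str] = None) -> Optional[str]:
    """Return the number to call back, taken only from the trusted directory.

    caller_supplied_number is accepted so that it can be logged by the
    caller of this function, but it is deliberately never returned.
    A result of None means there is no trusted number: escalate to security.
    """
    return TRUSTED_DIRECTORY.get(claimed_identity.strip().lower())


# The caller claims to be the help desk and offers a "direct line".
print(callback_number("IT Help Desk", caller_supplied_number="+12025550199"))
# An identity with no directory entry yields None.
print(callback_number("Regional Audit Office", caller_supplied_number="+12025550198"))
```

The design choice is that the number offered by the caller has no path to the output at all, so a convincing pretext cannot talk the tool into dialling it.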

Organisational Defences

Technical Controls

  • Enterprise voice-security platforms — solutions like Pindrop, Hiya, and Mutare analyse call metadata, audio patterns, and behavioural signals to detect spoofed calls, AI-generated audio, and known scam patterns before the call reaches the employee.
  • STIR/SHAKEN verification — ensure your telephony provider implements STIR/SHAKEN and configure your PBX to display attestation levels where possible.
  • Call-recording and analytics — for customer-facing roles (support, finance), record calls and apply AI analytics to detect social-engineering patterns in real-time.
  • Internal caller verification — implement a verbal authentication mechanism for sensitive internal calls. A pre-shared verbal code or challenge-response phrase confirms the caller's identity independent of caller-ID.

Process Controls

  • No sensitive actions on inbound calls — implement a blanket policy: no password resets, no MFA code sharing, no wire transfers, and no access grants based on inbound calls alone. All such requests must be verified through an independent channel.
  • Wire transfer verbal confirmation — require a verbal call-back to a known number for all wire transfers above a defined threshold. The call-back must reach the authorising executive, not an assistant or voicemail.
  • Escalation procedures — define a clear process for employees who receive suspicious calls: hang up, report to security, and log the incident (caller-ID, time, pretext, claimed identity).
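Where requests are tracked in a ticketing or workflow system, the first two process controls can be expressed as a simple gate. The sketch below is one possible encoding, not a standard: the action names, the `Request` fields, and the 10,000 threshold are assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Assumed action names for the sensitive operations listed above.
SENSITIVE_ACTIONS = {"password_reset", "mfa_code_share",
                     "wire_transfer", "access_grant"}
WIRE_CALLBACK_THRESHOLD = 10_000  # hypothetical threshold, in account currency


@dataclass
class Request:
    action: str
    channel: str  # for example "inbound_call" or "ticket"
    independently_verified: bool = False
    amount: float = 0.0
    callback_reached_executive: bool = False


def may_proceed(req: Request) -> bool:
    """Apply the blanket inbound-call rule and the wire call-back rule."""
    if req.action not in SENSITIVE_ACTIONS:
        return True
    # No sensitive action on the strength of an inbound call alone.
    if req.channel == "inbound_call" and not req.independently_verified:
        return False
    # Wire transfers above the threshold need a call-back that reached
    # the authorising executive, not an assistant or voicemail.
    if req.action == "wire_transfer" and req.amount > WIRE_CALLBACK_THRESHOLD:
        return req.callback_reached_executive
    return True


print(may_proceed(Request("password_reset", "inbound_call")))
print(may_proceed(Request("wire_transfer", "ticket", amount=50_000,
                          callback_reached_executive=True)))
```

Encoding the rule this way makes the default answer "no" for sensitive inbound requests, so an employee under pressure is not the only control between the caller and the action.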

Training Controls

  • Vishing simulations — hire a social-engineering firm or use in-house red-team resources to conduct simulated vishing calls against employees. Test finance, IT, and executive assistant roles specifically.
  • Scenario-based training — train employees using the common scenarios (IT help desk, bank fraud, CEO wire transfer) with audio examples that demonstrate how convincing modern vishing calls sound.
  • AI voice awareness — play examples of AI-cloned voices alongside original recordings so employees understand that voice recognition is no longer a reliable identity-verification method.
Figure 2 — Vishing defence framework. Process controls (particularly the call-back verification policy) are the strongest layer because they address the attack at its core: the victim's decision to comply.

Responding to Vishing Incidents

If an Employee Falls for a Vishing Call

  1. Credential compromise — immediately reset the employee's password and revoke all active sessions. If MFA was bypassed, issue a new MFA token or re-enrol the device. Check for mailbox-rule changes (forwarding, deletion).
  2. Financial compromise — contact the bank immediately to initiate a wire recall. File a report with law enforcement (FBI IC3 in the US, Action Fraud in the UK) within 24 hours for the best chance of fund recovery.
  3. Remote-access compromise — if the employee installed remote-access software (TeamViewer, AnyDesk), isolate the device from the network, image the disk for forensic analysis, and treat it as a fully compromised endpoint.
  4. Scope assessment — determine what information was disclosed and what systems the attacker may have accessed as a result. Review access logs for the compromised account.

Forensic Evidence Collection

  • Caller-ID number — record the displayed number even though it is probably spoofed. It documents which identity the attacker impersonated and gives the carrier a reference point for tracing the call.
  • Call time and duration — retrieve from PBX or mobile-carrier records.
  • Pretext details — document exactly what the caller said, what they requested, and what information was provided
  • Call recording — if available, preserve the recording for AI voice-analysis (to determine if voice cloning was used)
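Capturing these items in a fixed structure makes reports comparable across incidents. The following is a sketch of one possible record format using only the Python standard library. The field names and the sample values are assumptions, not an established schema.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class VishingIncident:
    displayed_caller_id: str      # as shown to the employee, possibly spoofed
    call_start_utc: str           # ISO 8601 timestamp from PBX or carrier records
    duration_seconds: int
    claimed_identity: str
    pretext: str                  # what the caller said and asked for
    information_disclosed: List[str] = field(default_factory=list)
    recording_path: Optional[str] = None  # set only if a recording exists

    def to_json(self) -> str:
        """Serialise the record for the incident log."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


incident = VishingIncident(
    displayed_caller_id="+12025550111",
    call_start_utc=datetime(2026, 5, 5, 14, 30, tzinfo=timezone.utc).isoformat(),
    duration_seconds=412,
    claimed_identity="IT help desk",
    pretext="Unusual account activity; asked for employee ID and MFA code",
    information_disclosed=["employee ID"],
)
print(incident.to_json())
```

Keeping `information_disclosed` as a list supports the scope assessment: each entry maps to a containment action such as a password reset or token re-enrolment.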

Vishing exploits the persistent human assumption that a phone call is more trustworthy than an email. This assumption was reasonable in the PSTN era when caller-ID was difficult to spoof and voice cloning was science fiction. In 2026, neither condition holds. Organisations that train employees to treat inbound phone calls with the same suspicion as inbound emails, and that implement call-back verification as a reflexive policy, will resist the vishing attacks that bypass every email security control in their stack.

Frequently Asked Questions

How do attackers spoof caller ID?

Caller-ID spoofing is straightforward using VoIP (Voice over IP) services. The SIP (Session Initiation Protocol) used by VoIP allows the caller to set any value in the "From" header, which is displayed as the caller ID on the recipient's phone. Many legitimate VoIP providers offer caller-ID configuration as a standard feature (for businesses to display their main number). Attackers use offshore VoIP providers or self-hosted Asterisk PBX systems to set arbitrary caller IDs, including the target company's main number, a bank's phone number, or a government agency's number. The cost is negligible, typically less than one cent per minute for international VoIP calls.

Adebisi Oluwasoya

Senior Security Analyst

Threat Intelligence & IR

Adebisi is a CISSP-certified cybersecurity analyst with over eight years of experience in enterprise security. He specializes in threat intelligence and incident response, helping organizations detect, analyze, and neutralize advanced persistent threats. His work spans Fortune 500 companies across the financial, healthcare, and government sectors.
