How do attackers spoof caller ID?

Caller-ID spoofing is straightforward with VoIP (Voice over IP) services. SIP (Session Initiation Protocol), the signalling protocol used by VoIP, lets the caller set any value in the "From" header, which is typically what the recipient's phone displays as the caller ID. Many legitimate VoIP providers offer caller-ID configuration as a standard feature (so that businesses can display their main number). Attackers use offshore VoIP providers or self-hosted Asterisk PBX systems to set arbitrary caller IDs, including the target company's main number, a bank's phone number, or a government agency's number. The cost is negligible, typically less than one cent per minute for international VoIP calls.

Can AI really clone someone's voice convincingly?

Yes. Modern voice-synthesis models such as VALL-E (Microsoft) and Tortoise-TTS, and commercial platforms such as ElevenLabs, can produce highly convincing voice clones from as little as 3-10 seconds of sample audio. Longer samples (1-5 minutes) produce near-indistinguishable clones that reproduce tone, cadence, accent, and emotional inflection. Real-time voice cloning lets attackers hold live conversations in the cloned voice rather than just play pre-recorded messages. Sample audio can be sourced from earnings calls, conference presentations, YouTube videos, social media, or voicemail greetings. Defence requires out-of-band verification protocols that do not rely on voice recognition.

What should I do if I receive a suspicious call?

Follow the hang-up-and-call-back rule. If someone calls claiming to be from your bank, IT department, a government agency, or a colleague requesting sensitive action: 1) Do not provide any information, even to confirm your identity. 2) Tell the caller you will call them back. 3) Hang up. 4) Find the legitimate phone number independently (from the official website, your contacts, or a verified directory, never from the caller). 5) Call that number and ask to be connected to the person who allegedly called. If it was legitimate, they will confirm. If it was a scam, you have avoided the compromise. Never use a call-back number provided by the suspicious caller.

Voice Phishing (Vishing): How to Recognize and Prevent Phone Scams

Voice phishing — vishing — is social engineering conducted through the telephone channel. While email phishing dominates security awareness discussions, vishing exploits a channel where humans are psychologically more compliant, technically less protected, and culturally conditioned to respond. When the phone rings and someone identifies themselves as your bank's fraud department, your IT help desk, or your CEO, the instinct is to engage, not to verify.

Vishing predates email phishing by decades. Telephone fraud has existed since the phone system itself. But two technological developments have transformed vishing from a manual, low-scale attack into an industrial-grade threat: VoIP-based caller-ID spoofing (which makes any phone number appear on the recipient's screen) and AI voice cloning (which makes any voice reproducible from a few seconds of audio). Together, these technologies enable real-time, convincing impersonation of any person to any target at negligible cost.

How Vishing Works: The Technical Stack

Step 1 — Caller-ID Spoofing

Caller ID was designed as a convenience feature, not a security mechanism. The phone network transmits caller-ID information in the Calling Party Number (CPN) field of the call-setup signalling. In the traditional PSTN, the originating carrier sets the CPN. In VoIP networks, SIP allows the caller to set the "From" header to any value, and downstream networks often pass that value along without verifying it.

This architectural weakness means that the number displayed on your phone's screen can be set to anything:

Your company's main switchboard number
Your bank's customer-service line
A government agency (IRS, HMRC, police)
Your CEO's direct mobile number
Your own phone number (to create confusion)

Spoofing services are commercially available. Some require no more than signing up for a VoIP provider and configuring the outbound caller-ID field. Dedicated spoofing services (now illegal in many jurisdictions but still operational) charge as little as $0.01 per minute for calls with arbitrary caller IDs.
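
The weakness described in this step can be made concrete with a small defensive sketch. It parses a SIP INVITE and compares the number the caller placed in the "From" header with the number in the "P-Asserted-Identity" header (RFC 3325), which a trusted network element, rather than the caller, is meant to set. The sample message, phone numbers, host names, and function names are invented for illustration. Real networks often strip or omit P-Asserted-Identity at trust boundaries, so treat this as an illustration of the idea, not a working detector.

```python
import re
from typing import Optional

# Hypothetical SIP INVITE; every number and host here is made up.
SAMPLE_INVITE = (
    "INVITE sip:+15550100@carrier.example SIP/2.0\r\n"
    'From: "Acme Bank" <sip:+18005550199@voip.example>;tag=abc123\r\n'
    "To: <sip:+15550100@carrier.example>\r\n"
    "P-Asserted-Identity: <sip:+442079460000@gateway.example>\r\n"
    "Call-ID: 7f3c@voip.example\r\n"
    "\r\n"
)


def parse_headers(message: str) -> dict:
    """Map lowercase header names to values (first occurrence wins)."""
    headers = {}
    for line in message.split("\r\n")[1:]:  # skip the request line
        if not line:
            break  # a blank line ends the header section
        name, _, value = line.partition(":")
        headers.setdefault(name.strip().lower(), value.strip())
    return headers


def user_part(header_value: str) -> Optional[str]:
    """Extract the user part (the phone number) from a sip: or sips: URI."""
    match = re.search(r"sips?:([^@;>]+)@", header_value)
    return match.group(1) if match else None


def caller_id_mismatch(message: str) -> bool:
    """True when the caller-chosen number differs from the network-asserted one."""
    headers = parse_headers(message)
    claimed = user_part(headers.get("from", ""))
    asserted = user_part(headers.get("p-asserted-identity", ""))
    return asserted is not None and claimed != asserted


print(caller_id_mismatch(SAMPLE_INVITE))  # True for the sample above
```

The recipient's handset normally shows only the "From" value, which is why the mismatch is invisible to the person answering the call.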

Step 2 — AI Voice Cloning

Voice cloning has moved from research labs to consumer products. The technology works by training a neural network on audio samples of the target voice to learn the speaker's vocal characteristics (pitch, cadence, formant frequencies, accent, emotional inflection). Once trained, the model can generate new speech in the target voice from any text input.

Key capabilities in 2026:

Sample efficiency — convincing clones from 3-10 seconds of audio. High-fidelity clones from 1-5 minutes.
Real-time synthesis — the attacker speaks into a microphone and the output is transformed into the target voice in real-time, with latency under 200ms. This enables live conversations, not just pre-recorded playback.
Emotional control — the attacker can adjust the emotional tone (urgency, anger, calm, concern) of the synthesised voice independently of their own vocal delivery.
Multilingual capability — voice clones can generate speech in languages the original speaker never spoke, using the target's voice characteristics.

Sample Audio Sources

Attackers obtain voice samples from sources the target may never consider sensitive:

Earnings calls and investor presentations (publicly archived)
Conference talks and panel discussions (YouTube, Vimeo)
Podcast interviews and media appearances
Voicemail greetings (call the target's direct number after hours)
Social media stories and video posts
Webinar recordings and training videos

With a spoofed number and a cloned voice (or a convincing human actor) in place, the attacker executes the third layer of the stack: the social-engineering script. Vishing pretexts are consistently effective because the phone channel activates different psychological responses than email:

Immediacy — a phone call demands real-time response. There is no time to consult a colleague, check a policy, or think carefully.
Emotional engagement — voice conveys urgency, fear, authority, and empathy more effectively than text. An urgent voice triggers emotional compliance.
Social pressure — hanging up on someone feels rude. People stay on calls they would immediately delete as emails.
Reduced verification — email can be forwarded to IT for analysis. Phone calls are ephemeral; there is no "forward this call to security" button.

Figure 1 — The three-layer vishing attack stack. All three components are commercially available and combine to produce phone calls that are indistinguishable from legitimate ones.

Common Vishing Scenarios

Scenario 1 — IT Help Desk Impersonation

The attacker calls an employee, spoofing the company's IT help desk number. The pretext: "We have detected unusual activity on your account and need to verify your identity. Can you please provide your employee ID and the code from your authenticator app?" The employee provides the MFA code, which the attacker uses in real time to complete a login to the employee's account.

This scenario is devastating because it bypasses MFA without any technical exploit. The employee voluntarily provides the one-time code, and the attacker uses it within its validity window (typically 30-60 seconds for TOTP codes).
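
To see why a code read out over the phone is still usable by the attacker, here is a minimal sketch of TOTP as specified in RFC 6238, using only the Python standard library. The secret is the published RFC test key, and the 30-second step and six digits are common defaults assumed here, not necessarily what a given authenticator app uses.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 TOTP code (HMAC-SHA1 variant)."""
    counter = unix_time // step  # every timestamp in one step shares a counter
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation, as in RFC 4226
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def seconds_remaining(unix_time: int, step: int = 30) -> int:
    """Seconds until the current code rolls over."""
    return step - (unix_time % step)


RFC_TEST_SECRET = b"12345678901234567890"  # test key published in RFC 6238

print(totp(RFC_TEST_SECRET, 30))   # 287082
print(totp(RFC_TEST_SECRET, 59))   # 287082, same 30-second step
print(totp(RFC_TEST_SECRET, 60))   # 359152, next step
print(seconds_remaining(45))       # 15
```

The code depends only on the shared secret and the clock, not on who types it in. Anyone who hears it within the step (plus whatever clock-skew tolerance the server allows) can use it.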

Scenario 2 — Bank Fraud Department

The victim's caller ID shows their bank's legitimate phone number. The caller identifies themselves as the fraud department: "We have detected a suspicious transaction on your account. For security, we need to verify your identity. Can you confirm the last four digits of your card number and the one-time code we just sent to your phone?" The "code" is actually a password-reset or transaction-authorisation code triggered by the attacker.

Scenario 3 — CEO Wire Transfer (Vishing Whaling)

The CFO receives a call from what appears to be the CEO's mobile number. The voice is indistinguishable from the CEO's (cloned from an earnings call). The CEO claims to be travelling, mentions a confidential acquisition, and requests an urgent wire transfer. The CFO, recognising the voice and number, initiates the transfer. Individual incidents of this kind have reportedly produced losses as high as $35 million.

Scenario 4 — Government Agency Impersonation

Attackers impersonate the tax authority (IRS, HMRC), law enforcement, or immigration services. The pretext involves immediate legal consequences: "You have an outstanding tax liability. If you do not pay immediately, a warrant will be issued for your arrest." Fear and urgency override rational judgment, particularly for individuals unfamiliar with how government agencies actually operate.

Scenario 5 — Hybrid Vishing (Email + Phone)

The most sophisticated attacks combine channels. The victim receives a legitimate-looking email about an account issue, followed by a phone call from someone claiming to be the sender of the email. The email legitimises the call: "I sent you an email earlier about the security update. Let me walk you through it." The phone call then directs the victim to a phishing page or extracts credentials directly.

STIR/SHAKEN: Technical Countermeasures for Caller-ID Spoofing

The STIR/SHAKEN framework (Secure Telephone Identity Revisited / Signature-based Handling of Asserted information using toKENs) is the telecommunications industry's answer to caller-ID spoofing. It provides cryptographic attestation of caller identity.

How It Works

The originating carrier creates a digital signature (using a certificate from a trusted Certificate Authority) that attests to the calling number. The attestation has three levels:
- A (Full) — the carrier has authenticated the caller and confirmed they are authorised to use the number
- B (Partial) — the carrier has authenticated the caller but cannot confirm authorisation for the specific number
- C (Gateway) — the carrier received the call from an international gateway or another network and cannot verify the origin
The terminating carrier verifies the digital signature. If the signature is valid and the attestation level is A, the call is marked as verified. If the signature is missing, invalid, or low-attestation, the carrier may flag or block the call.
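
The attestation travels in a signed token called a PASSporT (RFC 8225, with the SHAKEN extension in RFC 8588), which has the same three-part layout as a JWT. The sketch below builds a sample token and applies the handling rule described above. It is a simplified illustration: the telephone numbers, certificate URL, and function names are hypothetical, the signature part is a placeholder, and the ES256 signature check is reduced to a boolean input because it needs a cryptography library and the carrier's certificate chain.

```python
import base64
import json
from typing import Optional


def b64url_encode(data: bytes) -> str:
    """Base64url without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Restore the stripped padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def make_sample_passport(attest: str) -> str:
    """Build a sample token; the third part is NOT a real signature."""
    header = {"alg": "ES256", "ppt": "shaken", "typ": "passport",
              "x5u": "https://certs.example/sp.pem"}  # hypothetical URL
    payload = {"attest": attest, "dest": {"tn": ["15550100"]},
               "iat": 1767225600, "orig": {"tn": "18005550199"},
               "origid": "123e4567-e89b-12d3-a456-426614174000"}
    parts = [b64url_encode(json.dumps(p).encode()) for p in (header, payload)]
    return ".".join(parts + ["placeholder-signature"])


def attestation_level(token: str) -> Optional[str]:
    """Read the 'attest' claim (A, B, or C) from the payload."""
    _header, payload_b64, _signature = token.split(".")
    return json.loads(b64url_decode(payload_b64)).get("attest")


def classify_call(token: Optional[str], signature_valid: bool) -> str:
    """Apply the rule above: only a valid signature with level A is verified."""
    if token is None or not signature_valid:
        return "flag-or-block"
    return "verified" if attestation_level(token) == "A" else "flag-or-block"


print(classify_call(make_sample_passport("A"), signature_valid=True))  # verified
print(classify_call(make_sample_passport("C"), signature_valid=True))  # flag-or-block
```

Whether a low-attestation call is flagged, blocked, or delivered unchanged is a carrier policy decision, so the single "flag-or-block" outcome here is a simplification.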

Limitations

International calls — STIR/SHAKEN does not cover calls originating from international carriers that have not implemented the framework. Most vishing calls originate from countries without STIR/SHAKEN implementation.
Landline and legacy networks — traditional PSTN infrastructure cannot carry STIR/SHAKEN signatures. Calls from legacy networks receive C-level attestation at best.
Adoption gaps — while US carriers are required to implement STIR/SHAKEN (FCC mandate), enforcement of call blocking based on attestation levels is inconsistent.
Caller-ID display — many consumer phones do not display attestation level, so users cannot see whether a call's caller-ID has been cryptographically verified or not.

STIR/SHAKEN is a meaningful improvement but is not sufficient as a sole defence. It reduces domestic caller-ID spoofing but does not eliminate international or VoIP-originated spoofing.

Recognition Techniques for Individuals

Red Flags in Vishing Calls

Unsolicited inbound calls requesting sensitive information — legitimate organisations almost never call you to request passwords, MFA codes, account numbers, or payment. They already have your information.
Urgency and time pressure — "You must act now or your account will be locked / a warrant will be issued / you will lose access." Legitimate callers can wait.
Requests for one-time codes — no legitimate organisation will ask you to read back an MFA code, SMS verification code, or one-time password received on your phone.
Threats of legal consequences — government agencies do not call to threaten arrest. Tax authorities send written notices.
Refusal to provide a call-back number — if the caller refuses to let you hang up and call back, it is almost certainly a scam.
Unusual audio quality — AI voice clones occasionally produce artifacts: unnatural pauses, slight robotic quality, or inconsistent background noise.
Request to install software — "I need you to install TeamViewer / AnyDesk so I can help you" is a remote-access trojan delivery mechanism.

The Call-Back Verification Protocol

The single most effective defence against vishing is simple:

Receive the call. Do not provide information.
Tell the caller: "Thank you for contacting me. I will call you back to verify." Note the claimed name and department.
Hang up.
Find the legitimate phone number independently (from the official website, your employee directory, or a known contact). Never use a number provided by the caller.
Call the legitimate number and ask for the person who called.

If the call was legitimate, the caller will be there. If it was vishing, you have prevented the compromise with zero risk. This protocol must be trained until it becomes reflexive.
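
The key property of the protocol is that nothing supplied by the call itself is ever used to choose the call-back number. A minimal sketch of that rule follows, assuming a hypothetical trusted directory and made-up names and numbers.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical directory built from official sources, never from the call.
TRUSTED_DIRECTORY = {
    "Acme Bank fraud department": "+18005550199",
    "IT help desk": "+15550123",
}


@dataclass
class InboundCall:
    claimed_identity: str           # who the caller says they are
    displayed_number: str           # caller ID shown on screen (spoofable)
    offered_callback: Optional[str] = None  # number the caller suggests


def callback_number(call: InboundCall, directory: dict) -> Optional[str]:
    """Return the number to call back, looked up independently.

    The displayed and offered numbers are deliberately ignored. None means
    the claim cannot be verified and the call should be reported.
    """
    return directory.get(call.claimed_identity)


suspicious = InboundCall(
    claimed_identity="Acme Bank fraud department",
    displayed_number="+18005550199",   # looks right, proves nothing
    offered_callback="+15550666",      # attacker-controlled line
)
print(callback_number(suspicious, TRUSTED_DIRECTORY))  # +18005550199
```

Even when the displayed number matches the directory entry, the call-back still matters: it reaches the real organisation, which can confirm whether anyone there actually placed the call.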

Organisational Defences

Technical Controls

Enterprise voice-security platforms — solutions like Pindrop, Hiya, and Mutare analyse call metadata, audio patterns, and behavioural signals to detect spoofed calls, AI-generated audio, and known scam patterns before the call reaches the employee.
STIR/SHAKEN verification — ensure your telephony provider implements STIR/SHAKEN and configure your PBX to display attestation levels where possible.
Call-recording and analytics — for customer-facing roles (support, finance), record calls and apply AI analytics to detect social-engineering patterns in real-time.
Internal caller verification — implement a verbal authentication mechanism for sensitive internal calls. A pre-shared verbal code or challenge-response phrase confirms the caller's identity independent of caller-ID.

Process Controls

No sensitive actions on inbound calls — implement a blanket policy: no password resets, no MFA code sharing, no wire transfers, and no access grants based on inbound calls alone. All such requests must be verified through an independent channel.
Wire transfer verbal confirmation — require a verbal call-back to a known number for all wire transfers above a defined threshold. The call-back must reach the authorising executive, not an assistant or voicemail.
Escalation procedures — define a clear process for employees who receive suspicious calls: hang up, report to security, and log the incident (caller-ID, time, pretext, claimed identity).
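
Process controls like these can be expressed as a simple gate in a ticketing or payment workflow. The sketch below is one possible encoding, not a reference implementation: the action names, the channel labels, and the 10,000 threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Assumed labels for the sensitive actions named in the policy above.
SENSITIVE_ACTIONS = {"password_reset", "mfa_code_share", "wire_transfer", "access_grant"}
WIRE_CALLBACK_THRESHOLD = 10_000  # hypothetical threshold, in account currency


@dataclass
class Request:
    action: str
    channel: str                 # e.g. "inbound_call", "email", "ticket"
    callback_verified: bool      # confirmed via call-back to a known number
    amount: float = 0.0


def may_proceed(req: Request) -> bool:
    """Allow a request only if the policy's verification condition is met."""
    large_wire = req.action == "wire_transfer" and req.amount > WIRE_CALLBACK_THRESHOLD
    sensitive_inbound = req.action in SENSITIVE_ACTIONS and req.channel == "inbound_call"
    if large_wire or sensitive_inbound:
        return req.callback_verified
    return True


print(may_proceed(Request("password_reset", "inbound_call", callback_verified=False)))  # False
print(may_proceed(Request("wire_transfer", "email", callback_verified=False, amount=50_000)))  # False
print(may_proceed(Request("wire_transfer", "email", callback_verified=True, amount=50_000)))  # True
```

Encoding the rule in the workflow removes the decision from the employee at the moment of pressure, which is when a vishing pretext is most persuasive.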

Training Controls

Vishing simulations — hire a social-engineering firm or use in-house red-team resources to conduct simulated vishing calls against employees. Test finance, IT, and executive assistant roles specifically.
Scenario-based training — train employees using the common scenarios (IT help desk, bank fraud, CEO wire transfer) with audio examples that demonstrate how convincing modern vishing calls sound.
AI voice awareness — play examples of AI-cloned voices alongside original recordings so employees understand that voice recognition is no longer a reliable identity-verification method.

Figure 2 — Vishing defence framework. Process controls (particularly the call-back verification policy) are the strongest layer because they address the attack at its core: the victim's decision to comply.

Responding to Vishing Incidents

If an Employee Falls for a Vishing Call

Credential compromise — immediately reset the employee's password and revoke all active sessions. If MFA was bypassed, issue a new MFA token or re-enrol the device. Check for mailbox-rule changes (forwarding, deletion).
Financial compromise — contact the bank immediately to initiate a wire recall. File a report with law enforcement (FBI IC3 in the US, Action Fraud in the UK) within 24 hours for the best chance of fund recovery.
Remote-access compromise — if the employee installed remote-access software (TeamViewer, AnyDesk), isolate the device from the network, image the disk for forensic analysis, and treat it as a fully compromised endpoint.
Scope assessment — determine what information was disclosed and what systems the attacker may have accessed as a result. Review access logs for the compromised account.

Forensic Evidence Collection

Caller-ID number — record the spoofed number. While it may be fake, it provides an indicator of the attacker's infrastructure.
Call time and duration — retrieve from PBX or mobile-carrier records
Pretext details — document exactly what the caller said, what they requested, and what information was provided
Call recording — if available, preserve the recording for AI voice-analysis (to determine if voice cloning was used)
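
Capturing these items in a consistent structure makes incidents easier to compare and hand over to investigators. The record below is a hypothetical schema covering the fields listed above. The field names and sample values are illustrative and not taken from any standard.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class VishingIncident:
    caller_id: str                      # displayed number, possibly spoofed
    call_start: datetime                # from PBX or carrier records
    duration_seconds: int
    claimed_identity: str
    pretext: str                        # what the caller said and requested
    information_disclosed: List[str] = field(default_factory=list)
    recording_path: Optional[str] = None  # preserved recording, if any

    def to_json(self) -> str:
        """Serialise the record, writing the timestamp in ISO 8601 form."""
        record = asdict(self)
        record["call_start"] = self.call_start.isoformat()
        return json.dumps(record, indent=2)


incident = VishingIncident(
    caller_id="+15550123",
    call_start=datetime(2026, 1, 15, 9, 30, tzinfo=timezone.utc),
    duration_seconds=240,
    claimed_identity="IT help desk",
    pretext="Unusual account activity; asked for employee ID and MFA code",
    information_disclosed=["employee ID", "MFA code"],
)
print(incident.to_json())
```

Storing the timestamp with an explicit time zone avoids ambiguity when it is later matched against PBX, carrier, and login logs.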

Vishing exploits the persistent human assumption that a phone call is more trustworthy than an email. This assumption was reasonable in the PSTN era when caller-ID was difficult to spoof and voice cloning was science fiction. In 2026, neither condition holds. Organisations that train employees to treat inbound phone calls with the same suspicion as inbound emails, and that implement call-back verification as a reflexive policy, will resist the vishing attacks that bypass every email security control in their stack.

Key Takeaways