Google Advanced Search Evolution
Google’s advanced search has changed in three major ways over the last few years:
- It has become far more AI-driven and conversational.
- Many traditional operator-based workflows are less visible or less reliable.
- Search results are increasingly synthesised instead of simply ranked.
Here’s the practical breakdown.
Google moved from “query matching” to “answer generation”
Historically, advanced search meant:
- Boolean operators (AND, OR, -)
- Exact-match quotes
- site:
- filetype:
- intitle:
- date filtering
- highly structured keyword queries
Google still supports many of these, but the core UX is no longer centred around precision operators. Instead, Google now pushes:
- natural-language questions
- follow-up conversational search
- multimodal search (voice, image, screenshots)
- AI-generated summaries (“AI Overviews”)
- “AI Mode” conversational results
Example:
Old-style advanced query: site:gov.uk “money laundering” filetype:pdf
New-style Google behaviour: What are the current UK anti-money laundering guidance documents for regulated entities?
Google increasingly tries to infer:
- intent
- context
- entity relationships
- authority
- likely next questions
instead of strictly obeying keyword syntax.
AI Overviews changed the structure of search results
Google now frequently inserts AI-generated summaries at the top of results pages. These:
- synthesise information from multiple sources
- answer informational queries directly
- reduce the need to click on websites
- often appear before traditional blue links
This fundamentally changes advanced search because:
- users receive interpreted answers instead of raw source lists
- source selection is less transparent
- ranking signals are no longer the whole story
Research in 2026 found:
- AI Overviews appear for a large percentage of informational searches
- cited sources can differ substantially from standard organic rankings
- responses can vary between runs of the same query
So advanced searching is shifting from “find documents” to “interrogate a synthesis engine”.
Some classic operators are weaker or inconsistently supported
Many advanced operators still work:
- site:
- filetype:
- quotes “ ”
- -keyword
- OR
- before: / after:
But Google has gradually reduced emphasis on:
- highly granular operator documentation
- exact matching
- predictable Boolean logic
Certain legacy operators have been deprecated or have become unreliable over time, including:
- link:
- info:
- some uses of inanchor:
- precise wildcard behaviour
Modern Google prioritises semantic relevance over literal operator precision.
Search became multimodal
Advanced search now includes:
- image search via Google Lens
- “Circle to Search”
- voice queries
- screenshot-based lookup
- camera-driven search workflows
Google reports massive growth in visual search usage. That means “advanced search” increasingly includes:
- visual context
- object recognition
- OCR
- scene understanding
- geospatial inference
“Web-only” search is now a niche workflow
A growing number of users deliberately bypass AI-enhanced results using:
- Google “Web” tab
- udm=14 URL parameter
This strips out:
- AI Overviews
- shopping modules
- many rich SERP features
and restores a more traditional link-based experience.
Example: https://www.google.com/search?q=osint&udm=14
This has become popular among:
- OSINT practitioners
- researchers
- journalists
- technical users
- archivists
who want less interpretation and more direct visibility into the source.
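As a minimal sketch, the web-only URL above can be generated for any query. The helper name `web_only_url` is hypothetical; the only parts taken from the text are the `q` and `udm=14` parameters, and Google could change how `udm` behaves.

```python
from urllib.parse import urlencode


def web_only_url(query: str) -> str:
    """Build a Google search URL that asks for the "Web" view via udm=14."""
    # urlencode percent-encodes operators and quotes so they survive in the URL.
    params = {"q": query, "udm": "14"}
    return "https://www.google.com/search?" + urlencode(params)


print(web_only_url("osint"))
print(web_only_url('site:gov.uk "money laundering"'))
```

Encoding the query through `urlencode` rather than string concatenation keeps operators such as site: and quoted phrases intact when the URL is bookmarked or scripted.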
SEO and discoverability changed dramatically
Google’s newer search systems increasingly reward:
- authority
- experience-led content
- structured data
- citations
- topical trust
- entity relationships
rather than pure keyword optimisation. For advanced search users, this means:
- fewer obscure niche sources surface organically
- high-authority domains dominate AI summaries
- long-tail discoverability is reduced in some contexts
Advanced search is now partly “prompt engineering”
Experienced users increasingly search using:
- long natural-language prompts
- layered contextual questions
- iterative refinement
- conversational follow-ups
Example progression: Find Isle of Man AML regulations for TCSPs
then: Only official regulatory guidance
then: Compare with FATF Recommendation 10
This resembles interacting with an LLM more than with classic search engine syntax.
What has not changed
Traditional advanced operators remain extremely useful for:
- OSINT
- compliance research
- legal/regulatory discovery
- academic work
- cyber investigations
- metadata hunting
Especially:
- site:
- filetype:
- “exact phrase”
- before:
- after:
These are still among the highest-value precision tools in Google Search.
Net effect
Google’s advanced search has evolved from:
| Old Google | Current Google |
| Keyword retrieval | AI-assisted synthesis |
| Exact operators | Intent interpretation |
| Blue links | AI summaries + blended SERPs |
| Precision queries | Conversational prompting |
| Text-centric | Multimodal |
| User evaluates sources | Google pre-interprets sources |
For technical users, investigators, analysts and researchers, the biggest adaptation is:
- combining classic operators with conversational querying
- explicitly forcing web-only results when necessary
- validating AI summaries against primary sources rather than trusting synthesised output unquestioningly.
Quotes (“ ”) and asterisks (*) used to be central to Google’s advanced search precision, but their behaviour has changed noticeably as Google shifted toward semantic interpretation and AI-assisted retrieval.
Quotes “exact phrase”
Historically:
- Quotes forced near-literal phrase matching.
- Google returned pages containing the exact word sequence.
Example: “enhanced due diligence”
Originally, this meant:
- all three words
- in that order
- adjacent or nearly adjacent
What changed
Google still treats quotes as a strong precision signal, but:
- semantic expansion sometimes still occurs
- stemming/pluralisation can leak in
- snippets may highlight non-exact matches
- AI Overviews may summarise beyond the exact quoted text
However, quotes remain one of the most reliable operators for:
- legal text
- OSINT
- plagiarism checking
- identifying copied content
- finding exact document wording
- verifying whether a phrase actually exists online
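Because snippets and summaries can drift from the quoted text, one option is to confirm the exact wording locally once a candidate page has been saved. The following is a sketch only: `contains_exact_phrase` is a hypothetical helper, and it assumes the page text has already been extracted to a plain string.

```python
import re


def contains_exact_phrase(text: str, phrase: str) -> bool:
    """Return True if phrase occurs as a contiguous word sequence in text.

    Case and runs of whitespace (including line breaks) are ignored, but
    inserted words, plurals and other stemmed variants do not count.
    """
    words = [re.escape(word) for word in phrase.split()]
    if not words:
        return False
    # Lookarounds stop the phrase matching inside longer words.
    pattern = r"(?<!\w)" + r"\s+".join(words) + r"(?!\w)"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


page = "Firms must apply Enhanced  due\ndiligence to higher-risk customers."
print(contains_exact_phrase(page, "enhanced due diligence"))
print(contains_exact_phrase("enhanced customer due diligence", "enhanced due diligence"))
```

The first call matches despite the line break and capitalisation; the second does not, because an extra word sits inside the phrase.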
Quotes are now relatively more valuable.
Because Google has become more interpretive overall, exact quotes are now one of the few remaining ways to achieve higher precision.
Example: site:iomfsa.im “source of wealth”
This is still highly effective for regulatory discovery.
Asterisk *
The asterisk operator changed much more dramatically.
Original behaviour
Historically, an asterisk inside a quoted phrase such as “Tony * Bennett” acted roughly as a wildcard for one or more unknown words.
Google could match:
- Tony J Bennett
- Tony Alan Bennett
- Tony and Bennett
Similarly: “money laundering * regulations”
might match:
- money laundering reporting regulations
- money laundering prevention regulations
This was extremely useful for:
- partial quotations
- song lyrics
- fragmented text recovery
- OSINT
- document reconstruction
Current behaviour of *
Google no longer documents * as a true wildcard operator in the same way. Today, it behaves inconsistently:
- sometimes ignored
- sometimes treated as a loose placeholder
- sometimes semantically expanded
- often less deterministic than before
In practice:
- it still occasionally works inside quoted phrases
- it is much less reliable for precision searching
- semantic search often overrides literal wildcard intent
Example: “the * fox jumps”
may return:
- exact wildcard-like substitutions
- semantically related phrases
- approximate matches
instead of strict positional substitution.
Modern replacement for wildcard workflows
Advanced users now often use:
- partial quotes
- OR logic
- multiple searches
- AI prompting
- regex-capable external search engines
- specialised OSINT tools
instead of relying on Google’s * operator.
Example modern workflow:
Instead of: “money laundering * regulations”
users now run: “money laundering regulations” OR “money laundering reporting regulations”
or: site:gov.uk “money laundering”
then refine iteratively.
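The OR-based workaround can be scripted when there are several candidate words for the gap. This is a sketch under assumptions: `expand_wildcard` is a hypothetical helper, the candidate fillers are supplied by the user, and straight quotes are used because they are easier to paste into a search box.

```python
def expand_wildcard(template: str, fillers: list[str]) -> str:
    """Replace the single * in template with each filler and OR the quoted phrases."""
    if template.count("*") != 1:
        raise ValueError("template must contain exactly one *")
    phrases = []
    for filler in fillers:
        # An empty filler drops the placeholder; split/join tidies the spacing.
        phrase = " ".join(template.replace("*", filler).split())
        phrases.append(f'"{phrase}"')
    return " OR ".join(phrases)


query = expand_wildcard("money laundering * regulations", ["", "reporting", "prevention"])
print(query)
```

Each generated phrase is an exact-match query in its own right, so the result does not depend on how Google currently treats the asterisk.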
Important nuance: Google now “understands” missing words
Because modern Google uses entity understanding and embeddings:
- it often infers omitted words automatically
- wildcard precision matters less to Google internally
- but matters more to investigators/researchers
This is one reason the explicit * operator lost importance.
Google assumes: “I know what the user probably means.”
Whereas traditional advanced search users often want: “Return only literal structures.”
That philosophical shift is a major change in search behaviour.
Current best practices
Use quotes for:
- exact wording
- legal/regulatory language
- leaked-text validation
- document tracing
- name disambiguation
- OSINT verification
Example: “beneficial ownership register”
Use * only experimentally
It can still help occasionally for:
- forgotten song lyrics
- partial quotations
- fragmented text
Example: “to be * not to be”
But do not rely on it operationally.
Better modern alternatives
For precision: “exact phrase” site:domain.com
For variation handling: (“EDD” OR “enhanced due diligence”)
For date scoping: after:2024 before:2026
For investigative work:
- Google + alternative engines
- OCR tools
- vector search
- specialised databases
- AI-assisted summarisation
- direct site search
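The precision, variation and date-scoping patterns above can be combined in one query string. The sketch below is illustrative only: the function names are hypothetical, and it assumes full YYYY-MM-DD dates for after: and before:, which is more explicit than the year-only form shown earlier.

```python
from datetime import date
from typing import Optional, Sequence


def or_group(terms: Sequence[str]) -> str:
    """Quote each term and join the alternatives with OR inside parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def scoped_query(terms: Sequence[str], site: Optional[str] = None,
                 start: Optional[date] = None, end: Optional[date] = None) -> str:
    """Combine an OR group with optional site: and after:/before: scoping."""
    parts = [or_group(terms)]
    if site:
        parts.append(f"site:{site}")
    if start:
        parts.append(f"after:{start.isoformat()}")
    if end:
        parts.append(f"before:{end.isoformat()}")
    return " ".join(parts)


print(scoped_query(["EDD", "enhanced due diligence"], site="gov.uk",
                   start=date(2024, 1, 1), end=date(2026, 1, 1)))
```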
Bottom line
| Operator | Old Google | Current Google |
| “quotes” | Strong literal match | Still strong, but within semantic search |
| * wildcard | Useful positional wildcard | Inconsistent and de-emphasised |
| Search philosophy | Literal retrieval | Intent inference |
Quotes remain essential for advanced users.
The wildcard operator is now more of a legacy artefact than a dependable precision tool.
Regex-capable external search engines
Regex-capable and regex-adjacent search tools sit outside mainstream web search engines such as Google Search, which intentionally abstract away low-level pattern matching. For OSINT, DFIR, compliance, cyber investigations, research and archival work, practitioners instead use platforms that support:
- regular expressions directly
- Lucene syntax
- grep-like pattern matching
- advanced indexing/query DSLs
- proximity/wildcard/fuzzy logic beyond standard search engines
Here are the major categories:
True regex-capable search tools
https://grep.app/
Searches public GitHub repositories with regex support.
Excellent for:
- secrets hunting
- malware research
- API key discovery
- code intelligence
- infrastructure fingerprinting
Example: AKIA[0-9A-Z]{16}
Matches strings shaped like AWS access key IDs, which can indicate exposed credentials.
Another: password\s*=\s*["']
Finds hardcoded password assignments.
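To see what these two patterns actually match, they can be tried locally before being pasted into a code search tool. This is a minimal sketch: the `scan` helper and its labels are hypothetical, the sample text is made up, and the key shown is the placeholder value used in AWS documentation, not a real credential.

```python
import re

# The two patterns quoted above, written with straight quotes.
PATTERNS = {
    "aws_access_key_id": re.compile(r"AKIA[0-9A-Z]{16}"),
    "password_assignment": re.compile(r"password\s*=\s*[\"']"),
}


def scan(text: str) -> list[tuple[int, str]]:
    """Return (line number, pattern label) for every line that matches a pattern."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for label, regex in PATTERNS.items():
            if regex.search(line):
                findings.append((line_no, label))
    return findings


sample = "\n".join([
    "aws_access_key_id = AKIAIOSFODNN7EXAMPLE",
    'password = "hunter2"',
    "passphrase: none",
])
print(scan(sample))
```

Both patterns are deliberately loose, so hits are leads to review rather than confirmed secrets.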
https://sourcegraph.com (Paid)
Enterprise-grade code search with strong regex support.
Supports:
- regex
- structural search
- repository scoping
- symbol indexing
Widely used in:
- AppSec
- large engineering environments
- vulnerability research
Shodan
Not a full regex engine, but it supports powerful banner matching and filtering.
Useful for:
- exposed services
- ICS/SCADA
- internet-facing infrastructure
- TLS/certificate analysis
Example: product:"nginx"
or:
ssl:"example.com"
Supports:
- fielded searching
- certificate parsing
- host enumeration
- protocol metadata analysis
Closer to query DSL than consumer search.
https://www.elastic.co (Paid)
The backbone of many enterprise investigation systems.
Supports:
- Lucene regex
- wildcard
- proximity
- fuzzy matching
- Boolean logic
Example Lucene regex: /error-[0-9]+/
Common in:
- SIEM
- SOC operations
- log analysis
- compliance monitoring
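In Elasticsearch's JSON query DSL the same pattern is usually passed to a regexp query without the surrounding slashes. The sketch below only builds the request body; the field name `message.keyword` is an assumption for illustration and no cluster is contacted.

```python
import json


def regexp_query(field: str, pattern: str) -> dict:
    """Build a request body for an Elasticsearch regexp query on one field."""
    # The pattern must match a whole indexed term, so a keyword-style field
    # behaves differently from an analysed text field.
    return {"query": {"regexp": {field: {"value": pattern}}}}


body = regexp_query("message.keyword", "error-[0-9]+")
print(json.dumps(body, indent=2))
```

Because the expression is matched against entire terms rather than substrings, a prefix or suffix such as `.*` may be needed on unanalysed fields that hold full log lines.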
Regex-adjacent investigative platforms
Maltego
Not regex-centric, but supports transform-based data correlation.
Used heavily in:
- OSINT
- link analysis
- entity mapping
Often combined with regex extraction workflows.
SpiderFoot: https://www.intel471.com (Paid)
Can:
- scrape data
- extract indicators
- pattern-match emails/domains/IPs
Useful for automated reconnaissance.
https://github.com/lanmaster53/recon-ng
Often combined with:
- regex pipelines
- scraping
- grep/sed/awk tooling
More technical workflow.
Web search engines with limited advanced syntax
Supports some advanced operators and tends to preserve literal matching better than Google in certain contexts. Still not a true regex engine.
Power-user-oriented search engine with:
- lenses
- prioritisation
- domain weighting
- cleaner literal handling
Again, not regex, but often preferred by researchers frustrated with Google’s semantic drift.
Archive and dataset search tools
Wayback Machine
Not regex-native, but useful for:
- historical reconstruction
- deleted content
- comparing snapshots
Often paired with regex locally after export.
Common Crawl
Massive web crawl datasets.
Researchers frequently:
- download indexes
- run regex locally
- perform NLP extraction
Very powerful but technical.
CLI and local regex workflows
For serious investigators, regex search often moves locally.
Core tools include:
- grep
- ripgrep
- awk
- sed
- jq
- yq
Especially: ripgrep
Extremely fast recursive regex search.
Example: rg -i "beneficial owner"
Regex: rg "[A-Z]{2}[0-9]{6}"
Widely used in:
- DFIR
- code auditing
- leaked-data analysis
- SOC workflows
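Where ripgrep is not available, the same kind of recursive search can be approximated in Python. This is a rough sketch, not a replacement for rg: `search_tree` is a hypothetical helper, it reads every file as UTF-8 text, and it ignores the ignore-file and binary-file handling that ripgrep provides. The demo writes a throwaway file to a temporary directory.

```python
import re
import tempfile
from pathlib import Path


def search_tree(root: str, pattern: str, ignore_case: bool = False) -> list[tuple[str, int, str]]:
    """Recursively search files under root; return (path, line number, line) hits."""
    regex = re.compile(pattern, re.IGNORECASE if ignore_case else 0)
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="replace")
        except OSError:
            continue  # unreadable file: skip it rather than abort the search
        for line_no, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((str(path), line_no, line.strip()))
    return hits


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "notes.txt").write_text(
        "Beneficial Owner: J. Doe\nRef AB123456\n", encoding="utf-8")
    phrase_hits = search_tree(tmp, r"beneficial owner", ignore_case=True)
    ref_hits = search_tree(tmp, r"[A-Z]{2}[0-9]{6}")

print([(line_no, line) for _, line_no, line in phrase_hits])
print([(line_no, line) for _, line_no, line in ref_hits])
```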
Why Google moved away from regex-style search
Google optimised for:
- consumer usability
- intent inference
- AI summarisation
- semantic embeddings
- conversational search
Regex conflicts with that model because regex assumes:
- deterministic retrieval
- literal structure
- syntax discipline
Google now prioritises:
- probabilistic relevance
- inferred meaning
- blended answers
That makes regex-style workflows increasingly external to mainstream search.
Practical OSINT stack today
A modern investigator might combine:
| Task | Tool |
| General discovery | Google Search |
| Literal precision | quotes + site: |
| Infrastructure | Shodan |
| Code leakage | grep.app |
| Historical evidence | Wayback Machine |
| Large-scale extraction | ripgrep/Elasticsearch |
| Entity correlation | Maltego |
Most useful regex-capable tools by discipline
| Discipline | Best Tool |
| Cybersecurity | Shodan |
| Code intelligence | grep.app |
| SOC/SIEM | Elasticsearch |
| OSINT automation | SpiderFoot |
| DFIR/local analysis | ripgrep |
| Link analysis | Maltego |
https://www.google.com/advanced_search
Google Advanced Search is basically a form-based query builder. The boxes convert your entries into Google operators and URL parameters, so you do not have to type them manually.
| Box/dropdown | What it does | Manual equivalent |
| All these words | Searches for pages relevant to all the words you enter. Google may still use stemming, synonyms, and ranking logic. | word1 word2 word3 |
| This exact word or phrase | Searches for the words in that exact order. Best for names, quotes, error messages, document titles, phrases. | “exact phrase” |
| Any of these words | Searches for at least one of the terms. Useful for synonyms, aliases, spellings, or alternatives. | term1 OR term2 OR term3 |
| None of these words | Excludes results containing those words or phrases. | -word or -“excluded phrase” |
| Numbers ranging from | Searches for numbers within a range, often with units, dates, prices, weights, etc. | 10..35 kg, 2019..2024, £500..£900 |
| Language | Restricts results to pages in a selected language. | URL parameter / Google filter, not usually typed manually |
| Region | Prioritises or restricts results associated with a selected country/region. | URL parameter / Google filter |
| Last update | Limits results to pages updated within a period, such as the past 24 hours, week, month, or year. | Tools/date filter; sometimes similar to after: / before: |
| Site or domain | Searches only within a website or top-level domain. Very useful for OSINT. | site:example.com, site:.gov, site:.ac.uk |
| Terms appearing | Restricts where the terms appear: anywhere, title, text, URL, or links to the page. | intitle:, allintitle:, intext:, inurl: |
| File type | Finds specific file formats such as PDF, DOC, XLS, PPT, KML, etc. | filetype:pdf, filetype:xls, filetype:ppt |
| Usage rights | Filters by licence/reuse permissions. Useful for images and reusable content, but check the original source licence. | Google licence filter |
Google’s own Advanced Search page describes the main boxes this way: quotes for exact phrases, OR for alternatives, minus signs for excluded terms, two dots for number ranges, site/domain restriction, location of terms on the page, file-type filtering and usage-rights filtering.
For OSINT, the most useful boxes are usually:
All these words: fraud investigation
Exact phrase: “Tony Bennett”
Any of these words: leak OR breach OR database OR dump
None of these words: music singer
Site or domain: gov.uk
File type: pdf
Terms appearing: in the title of the page
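The mapping from form boxes to operators can be sketched as a small function. Everything here is illustrative: `form_to_query` and its parameter names are hypothetical, and the choice to prefix each title word with intitle: is an assumption for readability, not necessarily the exact string Google's form produces.

```python
from typing import Sequence


def form_to_query(all_words: str = "", exact_phrase: str = "",
                  any_words: Sequence[str] = (), none_words: Sequence[str] = (),
                  site: str = "", filetype: str = "", in_title: bool = False) -> str:
    """Translate Advanced Search style inputs into a manual operator query."""
    parts = []
    for word in all_words.split():
        parts.append(f"intitle:{word}" if in_title else word)
    if exact_phrase:
        parts.append(f'"{exact_phrase}"')
    if any_words:
        parts.append(" OR ".join(any_words))
    for term in none_words:
        # Multi-word exclusions need quotes so the minus applies to the phrase.
        parts.append(f'-"{term}"' if " " in term else f"-{term}")
    if site:
        parts.append(f"site:{site}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    return " ".join(parts)


print(form_to_query(
    all_words="fraud investigation",
    exact_phrase="Tony Bennett",
    any_words=["leak", "breach", "database", "dump"],
    none_words=["music", "singer"],
    site="gov.uk",
    filetype="pdf",
    in_title=True,
))
```

Building the string by hand like this makes it easy to save, version and rerun the same query, which the form itself does not offer.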
