One of the key factors in building a semantic core is the proper exclusion of irrelevant keywords. Below is a detailed explanation of how I perform keyword exclusion during semantic core development. This method is also useful when preparing keyword lists for Yandex Direct campaigns.
Removing Unwanted Keywords in Key Collector 4
Recently, Key Collector has become less reliable for parsing queries, but some of its other features still work well, and keyword exclusion is one of them. To filter out search queries containing words unsuitable for your semantic core or advertising campaigns, select all the queries you want to check, right-click, and choose “Send selected phrases to negative keywords”.

In the window that opens, select “Split into words”.

Sort the resulting list by the frequency of each word across the semantic core.
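
For readers who want to see what this split-and-sort step amounts to, here is a minimal Python sketch. It is an illustration only, not Key Collector's actual implementation: the function name `word_frequencies` and the sample queries are hypothetical, and it assumes that "frequency" means the number of queries in which a word occurs.

```python
from collections import Counter

def word_frequencies(queries):
    """Count how many queries each word appears in, most frequent first."""
    counts = Counter()
    for query in queries:
        # set() so that a word repeated inside one query is counted once
        counts.update(set(query.lower().split()))
    # Sort by descending frequency, then alphabetically for a stable order
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical sample queries
queries = [
    "buy laptop online",
    "buy laptop cheap",
    "laptop repair free",
]

for word, freq in word_frequencies(queries):
    print(word, freq)
```

Sorting by frequency puts the words that affect the most queries at the top, so one checkbox there can remove a large share of the irrelevant phrases.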

Check the boxes next to the words that should be added to the list of excluded words.

After selecting all exclusion words, click “Add” at the bottom of the window.


Open the “Excluded Words” window from the top menu.

Select all queries and click “Show found words”.

After this, Key Collector will display all queries that contain excluded words, allowing you to review and remove them.
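
The same review step can be reproduced outside the tool. The sketch below is an assumption-laden illustration rather than Key Collector's own logic: `split_by_excluded` is a hypothetical helper that matches whole words exactly (no word forms) and separates the queries to keep from the queries to review.

```python
def split_by_excluded(queries, excluded_words):
    """Return (kept, flagged); flagged pairs each query with the excluded words it contains."""
    excluded = {word.lower() for word in excluded_words}
    kept, flagged = [], []
    for query in queries:
        # Exact whole-word match against the excluded set
        hits = sorted(excluded.intersection(query.lower().split()))
        if hits:
            flagged.append((query, hits))
        else:
            kept.append(query)
    return kept, flagged

# Hypothetical sample data
sample_queries = ["buy laptop online", "laptop repair free", "free laptop download"]
kept, flagged = split_by_excluded(sample_queries, ["free", "download"])

print(kept)
for query, hits in flagged:
    print(query, "->", ", ".join(hits))
```

Keeping the matched words next to each flagged query makes it easier to spot a word that was excluded by mistake before anything is deleted.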

Excluding Words Using Overlead.me
Another tool that currently offers free keyword exclusion functionality is Overlead.me. Register an account and open the “Excluded Words” section.

Click “New Query”.

In the window that opens, enter a project name, paste all your queries into the “Keys” field, and click “Create”.

Once processing is complete, you’ll receive a notification by email. After that, you can proceed with keyword analysis.

In the project workspace, open the “Excluded Words” tab.

Select “Show 1000 entries”.

Begin marking unwanted words by checking the corresponding boxes.

If you’re unsure where a word is used, click the “+” symbol to view a list of queries containing it.

After marking all necessary words, click “Apply” to get a complete list of excluded phrases.

Extracting Excluded Phrases Using ChatGPT
This method requires access to ChatGPT with GPT-4 or a later model. Upload the Excel file containing your queries and use the following prompt:
Process this Excel file, extract all phrases, convert each word to its base form (lemmatization), and count the frequency of each word.
Ensure that words with different endings but the same meaning are treated as one word.
For example, ‘bank,’ ‘banks,’ ‘bank’s,’ and ‘banking’ should all be recognized as the same base word ‘bank’.
Use an extended lemmatization dictionary to cover more word forms.
Output the results as a table with the following columns:
'Word' — the word in its base form
'Frequency' — how often the word appears
'Queries containing the word'
After processing, you will receive a file listing every word in its base form, its frequency, and the queries that contain it. Use this table to pick out the words to exclude and the phrases they affect.
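
To sanity-check the table ChatGPT returns, the logic the prompt asks for can be roughly sketched in Python. This is a simplified stand-in, not what ChatGPT actually runs: the `LEMMAS` dictionary is a tiny hand-made, hypothetical substitute for a real lemmatizer, `word_table` is a hypothetical helper, and "Frequency" is assumed to mean the number of queries containing the word.

```python
import re
from collections import defaultdict

# Hypothetical hand-made lemma map; a real project would use a proper lemmatizer
LEMMAS = {"banks": "bank", "banking": "bank", "loans": "loan"}

def base_form(word):
    """Reduce a word to its base form using the small map above."""
    word = word.lower()
    # Drop possessive endings such as "bank's"
    word = re.sub(r"['’]s$", "", word)
    return LEMMAS.get(word, word)

def word_table(queries):
    """Build rows of (word, frequency, queries containing the word)."""
    rows = defaultdict(list)
    for query in queries:
        # Letter sequences, optionally followed by a possessive 's
        for word in re.findall(r"[^\W\d_]+(?:['’]s)?", query):
            lemma = base_form(word)
            if query not in rows[lemma]:
                rows[lemma].append(query)
    return sorted(
        ((lemma, len(found), found) for lemma, found in rows.items()),
        key=lambda row: (-row[1], row[0]),
    )

# Hypothetical sample queries
sample = ["best banks near me", "bank's loans", "online banking"]
for word, frequency, found in word_table(sample):
    print(word, frequency, found)
```

If the counts for a handful of words differ noticeably between this kind of check and the ChatGPT output, it is worth re-running the prompt or reviewing those words by hand.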

These are the primary methods for identifying excluded phrases, which help significantly reduce the amount of irrelevant data and optimize the process of building a semantic core.