How I Map Keywords to Pages Before Writing a Single Blog Post

This is the second post in a series about building a 110-post SEO content strategy from scratch. If you missed the first one, start here for the full overview.

The Problem With Keyword Research Done in Isolation

Most businesses approach keyword research the same way. They find a tool, type in their industry, get a list of terms with search volumes, pick the ones that look promising, and hand them to a writer. The writer produces content. The content gets published. Nothing ranks.

The missing step is not better keywords. It is understanding which page on the website each keyword belongs to and why. A keyword does not exist in a vacuum. It needs a home. And that home needs to be the right type of page for the intent behind the search.

Without that mapping, you end up in one of two bad situations. Either you create blog posts competing against your own service pages for the same keywords, or you create service pages targeting keywords that should be blog content. Both confuse Google and split your ranking potential instead of concentrating it.

The Two Types of Pages That Need Keywords

For the Tiger Tail project, the website had two distinct types of pages before a single blog post was written. Service pages and industry pages. Each type needs its own keyword logic.

Service pages target keywords where the searcher is looking for a solution or a provider. Someone searching “ai strategy consultant” or “workflow automation services” has commercial intent. They are not looking for an explanation. They are looking for someone to hire. These keywords belong on service pages, not blogs.

Industry pages target keywords where the searcher is a specific type of business looking for AI solutions relevant to their sector. Someone searching “ai for law firms” or “ai for real estate agents” has commercial intent too, but with an industry-specific lens. These keywords also belong on the industry pages, not on the blog.

Blog posts serve a different purpose. They capture informational searches from people who are not ready to buy yet but are researching the problem. The blog content feeds authority to the service and industry pages. The pages convert. The blog attracts.

Service pages and industry pages target buyers. Blog posts target researchers. Mixing them up is one of the most common and most damaging SEO mistakes a business can make.

The Actual Mapping: Real Data From the Project

Here is what the keyword-to-page mapping looked like for the Tiger Tail service pages. Every page got its primary keywords and monthly search volumes confirmed before any content was briefed.

service-page-keyword-map.txt

Page URL                              Primary Keyword                    Monthly Searches

/services/ai-audit-strategy            ai strategy consultant                      880
/services/ai-audit-strategy            ai readiness assessment                     720
/services/ai-audit-strategy            ai implementation consultant                390
/services/ai-audit-strategy            automation consultant                       480
/services/workflow-automation          business process automation services        320
/services/custom-ai-development        custom ai development company               480
/services/custom-ai-development        ai integration services                     590
/services/growth-engineering           ai marketing automation                     720
/services/growth-engineering           ai lead generation agency                   110
/services/ai-training-enablement       corporate ai training                        40

And here is the same mapping for the industry pages:

industry-page-keyword-map.txt

Page URL                              Primary Keyword                    Monthly Searches

/ai-for-legal                          ai for law firms                          1,300
/ai-for-real-estate                    ai real estate agent                        590
/ai-for-real-estate                    ai for real estate agents                   480
/ai-for-healthcare                     healthcare workflow automation              170
/ai-for-finance-accounting             ai for accounting firms                      70
/ai-for-home-services                  ai for contractors                          110
/ai-for-legal                          legal document automation                   170
/ai-for-healthcare                     ai for medical billing                       90

 

Looking at this data together, the legal page stands out immediately. At 1,300 searches per month, “ai for law firms” is the single highest-volume keyword across all pages on the site. That tells you the legal cluster needs serious depth in the blog to give that page the authority to compete.

The corporate AI training page, on the other hand, targets “corporate ai training” at just 40 searches per month. That is a low-volume keyword but the commercial intent behind it is very high. Someone searching that phrase is almost certainly a business ready to spend money on training. Low volume does not mean low value.
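To make the mapping easier to work with, here is a minimal sketch that holds a few rows from the two tables above as plain tuples and groups them by page. The function names are my own for illustration, not part of any tool used in the project.

```python
from collections import defaultdict

# (page URL, keyword, monthly searches) rows copied from the tables above.
KEYWORD_MAP = [
    ("/services/ai-audit-strategy", "ai strategy consultant", 880),
    ("/services/ai-audit-strategy", "ai readiness assessment", 720),
    ("/services/ai-training-enablement", "corporate ai training", 40),
    ("/ai-for-legal", "ai for law firms", 1300),
    ("/ai-for-legal", "legal document automation", 170),
    ("/ai-for-real-estate", "ai real estate agent", 590),
]


def keywords_by_page(rows):
    """Group keywords under their page, highest volume first."""
    pages = defaultdict(list)
    for url, keyword, volume in rows:
        pages[url].append((keyword, volume))
    for entries in pages.values():
        entries.sort(key=lambda entry: entry[1], reverse=True)
    return dict(pages)


def top_keyword(rows):
    """Return the single highest-volume (url, keyword, volume) row."""
    return max(rows, key=lambda row: row[2])


pages = keywords_by_page(KEYWORD_MAP)
best = top_keyword(KEYWORD_MAP)
print(best)
```

Keeping one row per keyword, each with exactly one URL, is the point: the structure itself makes it awkward to give a keyword two homes.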

How Search Volume Shapes Priority, Not Just Selection

This is the part most keyword guides miss. Search volume is not just a filter for deciding which keywords to target. It is an input for prioritising which content to build first and how much of it you need.

A page targeting a keyword with 1,300 monthly searches needs more supporting blog content around it than a page targeting 40 monthly searches. Not because the second page matters less, but because Google needs to see more topical depth before it will trust a new domain with a high-volume, competitive keyword.

volume-to-priority-logic.txt

Volume Range      What It Means                          Content Priority

1,000+            High demand. High competition.             Deep cluster needed.
                  Big brands likely dominating page 1.       10+ supporting posts.
                  New domain needs time and authority.

300 to 999        Solid demand. Beatable competition         Strong cluster needed.
                  with quality content and good structure.   8 to 10 supporting posts.

100 to 299        Moderate demand. Often less competitive.   Medium cluster.
                  Good early target for a new domain.        6 to 8 supporting posts.

10 to 99          Low volume. Often high commercial intent.  Focused cluster.
                  Worth targeting if buyer intent is clear.  5 to 6 supporting posts.

Under 10          Very niche. May still be worth it          Evaluate carefully.
                  if the buyer value per conversion is high. Single post may be enough.

 

This framework shaped the cluster structure for the project, as a baseline rather than a fixed rule. The legal cluster targeting 1,300 searches got ten posts. The AI training cluster targeting 40 searches also got ten posts, above the five to six the table suggests for that volume, and those posts are written differently. More specific, more technical, more conversion-oriented, because the person reading them is further along in their decision.
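The volume table can be written as a small lookup. This is a sketch of the thresholds exactly as listed, with a function name of my own choosing. Treat the result as a starting point only: as noted above, the 40-search training cluster still got ten posts.

```python
def content_priority(monthly_searches):
    """Map a monthly search volume to the cluster tier from the volume table."""
    if monthly_searches >= 1000:
        return "Deep cluster: 10+ supporting posts"
    if monthly_searches >= 300:
        return "Strong cluster: 8 to 10 supporting posts"
    if monthly_searches >= 100:
        return "Medium cluster: 6 to 8 supporting posts"
    if monthly_searches >= 10:
        return "Focused cluster: 5 to 6 supporting posts"
    return "Evaluate carefully: a single post may be enough"


for keyword, volume in [("ai for law firms", 1300), ("corporate ai training", 40)]:
    print(keyword, "->", content_priority(volume))
```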

Intent Is More Important Than Volume

Search volume tells you how many people are searching. Search intent tells you why. Getting the intent wrong is worse than targeting a low-volume keyword because it means you are attracting the wrong people even when you do rank.

Every keyword in the Tiger Tail mapping got an intent classification before it was assigned to a page. The classification is simple but it matters every time.

search-intent-classification.txt

Intent Type       What the Searcher Wants                 Right Page Type

Informational     Learning about a topic.                     Blog post.
                  Not ready to buy yet.
                  Example: "what is ai readiness assessment"

How-To            Looking for a process or steps.             Blog post or guide.
                  Example: "how to automate workflow"

Commercial        Researching providers or solutions.         Service or industry page.
                  Getting close to a decision.
                  Example: "ai strategy consultant"

Comparison        Evaluating options.                         Blog post or landing page.
                  Example: "make vs zapier vs custom automation"

Transactional     Ready to buy or contact.                    Service page with clear CTA.
                  Example: "hire ai implementation consultant"

 

A keyword like “what is an ai readiness assessment” is informational. It belongs in the blog as a post that educates the reader and links to the service page at the end. A keyword like “ai readiness assessment” with no qualifier is commercial. Someone typing that is likely comparing providers. It belongs on the service page itself.

Those two keywords look similar. They would land on completely different pages in a well-structured site. Getting that distinction right is what separates a site that converts from one that attracts traffic that never does anything.
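A rough first pass at intent classification can be automated. The sketch below is a crude heuristic, not the method used on the project: the prefix lists are my own assumptions, and anything without a recognised modifier defaults to commercial. A keyword like "ai contract review" would come out as commercial here, so every result still needs a human check.

```python
# Assumed modifier lists for illustration; extend them for your own niche.
TRANSACTIONAL_PREFIXES = ("hire ", "buy ", "book ")
HOW_TO_PREFIXES = ("how to ",)
INFORMATIONAL_PREFIXES = ("what is ", "what are ", "why ", "how ")

# Right page type per intent, following the classification table above.
PAGE_TYPE = {
    "informational": "blog post",
    "how-to": "blog post or guide",
    "commercial": "service or industry page",
    "comparison": "blog post or landing page",
    "transactional": "service page with clear CTA",
}


def classify_intent(keyword):
    """Guess search intent from the wording of the keyword alone."""
    kw = keyword.lower().strip()
    if kw.startswith(TRANSACTIONAL_PREFIXES):
        return "transactional"
    if kw.startswith(HOW_TO_PREFIXES):
        return "how-to"
    if kw.startswith(INFORMATIONAL_PREFIXES):
        return "informational"
    if " vs " in f" {kw} ":
        return "comparison"
    return "commercial"  # assumption: no qualifier means provider research


for kw in ["what is an ai readiness assessment", "ai readiness assessment"]:
    intent = classify_intent(kw)
    print(f"{kw!r}: {intent} -> {PAGE_TYPE[intent]}")
```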

Putting commercial intent keywords on blog posts and informational keywords on service pages is one of the most common ways content strategies fail quietly. The traffic numbers look fine. The conversions never come.

The Before and After of Keyword Mapping

Here is what the approach looks like without mapping versus with it:

before-vs-after-mapping.txt

WITHOUT KEYWORD MAPPING

"Let's write a blog about AI for law firms."
"Let's write about what an AI consultant does."
"Let's cover AI pricing."

Result: Random posts. No page authority built.
        Service pages get no support.
        Blog competes with its own pages.
        Nothing ranks for anything meaningful.


WITH KEYWORD MAPPING

"ai for law firms" (1,300/mo, commercial) → /ai-for-legal industry page
"how small law firms use ai" (informational) → blog post in legal cluster
"ai contract review" (informational/how-to) → blog post in legal cluster
"legal document automation" (170/mo, commercial) → /ai-for-legal industry page
"ai and billing ethics law firms" (informational) → blog post in legal cluster

Result: Industry page targets commercial keywords.
        Blog cluster builds topical authority around it.
        Every post links back to the parent page.
        Google sees depth and relevance. Rankings follow.

 

The difference is not subtle. In the first approach, a business is just publishing. In the second, every piece of content has a specific job to do and a specific place in the architecture.
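One way to keep every keyword in a single, fitting home is to check the map mechanically before any brief is written. The sketch below flags two problems: a keyword assigned to more than one page, and an intent that does not match the page type. The blog URLs and the short page-type labels are hypothetical, used only for the example.

```python
from collections import Counter

# Page types that suit each intent, following the intent table earlier in the post.
ALLOWED_TARGETS = {
    "informational": {"blog"},
    "how-to": {"blog", "guide"},
    "comparison": {"blog", "landing"},
    "commercial": {"service", "industry"},
    "transactional": {"service"},
}


def find_mapping_problems(assignments):
    """Return readable problems for (keyword, intent, page_type, url) rows."""
    problems = []
    counts = Counter(keyword for keyword, _, _, _ in assignments)
    for keyword, count in counts.items():
        if count > 1:
            problems.append(f"'{keyword}' is assigned to {count} pages")
    for keyword, intent, page_type, url in assignments:
        if page_type not in ALLOWED_TARGETS[intent]:
            problems.append(f"'{keyword}' ({intent}) does not fit {page_type} page {url}")
    return problems


assignments = [
    ("ai for law firms", "commercial", "industry", "/ai-for-legal"),
    ("ai for law firms", "commercial", "blog", "/blog/ai-for-law-firms"),
    ("how small law firms use ai", "informational", "blog", "/blog/how-small-law-firms-use-ai"),
]
problems = find_mapping_problems(assignments)
for problem in problems:
    print(problem)
```

Here the second row is the "blog competes with its own pages" case: it trips both checks at once.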

What Good Keyword Mapping Produces

By the time the keyword mapping was done for the Tiger Tail project, every page on the site had a clear primary keyword, a confirmed search volume, an intent classification, and a list of supporting blog topics that would feed it authority over time.

That groundwork meant every brief written after it had a reason to exist. Not just “here is a topic someone might find interesting” but “here is a keyword a real person searches for, here is the page it supports, here is how it fits into the cluster that will eventually rank the parent page.”

Keyword mapping is not a research exercise. It is a structural decision. It determines what gets built, where it lives, and what it is supposed to accomplish. Every hour spent on it saves ten hours of rewriting content that landed in the wrong place.

What Comes Next

With the keyword map in place, the next step was research. Not the generic kind where you read a few articles and summarise them. Proper data-backed research using Perplexity Sonar that produced real statistics, named sources, and proof points for every single post across all 110 briefs.

That process is what I cover in the next post: how I use Perplexity Sonar to research blog topics with real data.

If you want to talk through what keyword mapping would look like for your own website, book a call. I can usually tell within the first conversation whether a site’s content architecture is working for it or against it.

See how I approach SEO strategy →

Book a free 30-minute call →

Dhruv is an SEO consultant working with business owners, founders, and agencies. If organic search is not delivering for your business, this is where to start.

dhruv-seo.online

If you have not read the earlier posts in this series, start here to understand why most blogs fail and here for the competitor research approach.

Two Problems That Are Actually the Same Problem

The first problem is not knowing what to write about when competitor data is not an option. Perhaps nobody in the niche is blogging with measurable results, or the industry is too specific for competitor keywords to be meaningful, or the business simply wants to create content on its own terms rather than chasing what others are ranking for.

The second problem is that even when topic ideas exist, they never become a consistent publishing schedule. A blog calendar gets created in a meeting, lives in a Google doc for two weeks, and then quietly disappears. Publishing becomes irregular. Months go by. The blog never builds the compounding value it was supposed to.

These two problems look different on the surface but they come from the same place: there is no system underneath the content. The persona approach solves both at once. It gives you a method for generating months of relevant topics and a calendar that is specific enough to actually use.

Why Persona-Driven Content Works Differently

Keyword research tells you what people are searching for. Persona research tells you why they are searching for it and what they actually need when they get there.

Both matter. But for building long-term authority and genuine trust with your audience, persona-driven content wins. It speaks directly to the person behind the search rather than just matching the query. Readers feel understood. That is what makes them come back, share the content, and eventually reach out.

Content written without persona thinking tends to feel generic even when it is technically accurate. It covers the topic but it does not resonate with anyone in particular. It gets read and forgotten. It builds no relationship and no trust.

A blog that speaks to a specific person with a specific problem will always outperform a blog that speaks to everyone about a general subject. Specificity is what builds authority.

What a Buyer Persona Actually Is

A buyer persona is a detailed profile of an ideal customer. Not a demographic summary. A real picture of the person: their job role, their industry, what their day looks like, what keeps them stuck, what they are trying to achieve, what they search for when they have a problem, and what kind of content actually helps them make decisions.

Most businesses either have no defined personas or have ones that are too vague to be useful. Something like “marketing manager, 30 to 45, works at a mid-sized company” is not a persona. It is a demographic filter. A useful persona includes the specific frustrations, the exact questions they type into Google, and the outcomes they are trying to reach.

The good news is that you do not need a formal persona document to start. A rough description from someone who knows the customers well is enough to build on.

Why Blogs Without Persona Thinking Fail to Build Authority

The content is technically correct but feels like it could have been written for anyone. There is no consistent point of view. The topics jump around instead of building a coherent body of knowledge in one area. Readers do not feel like the brand actually understands their situation. They read, get the information they needed, and leave without ever considering the business behind the content.

Trust does not come from being informative. It comes from being specifically relevant to the person reading. That only happens when the content was built around a real understanding of who that person is.

The Full Process: From Personas to Published Calendar

Step 1 — Collect the buyer personas

Ask the business directly. Most will give you two to four personas without much prompting. What you need from each one: job title or role, the industry they work in, their biggest daily challenges, and the outcomes they are trying to achieve. If the business has never formally defined its personas, a rough description is fine to start. You are building a foundation, not a final document.

Step 2 — Use AI with Deep Research enabled

Open an AI tool that supports Deep Research mode. This feature allows the model to actively search the web rather than drawing only on its training data. That means the persona research it returns is grounded in current, real information: forums, communities, Reddit threads, LinkedIn discussions, industry publications, and survey data where it exists. This is what separates useful persona research from generic assumptions.

Step 3 — Run the persona research prompt

Feed the AI the business name and URL, a brief description of what it does and who it serves, the buyer personas, and the target location. Then ask it to research each persona in depth and return a specific number of blog topics based on what it finds. Here is the exact prompt to use:

persona-research-prompt.txt

I am building a blog content strategy for [Brand Name].
The website is [URL].
The brand [describe what it does and who it serves].

The buyer personas are:
[List each persona with job title or description]

Target location: [country or region]

Please use deep research to give me a detailed breakdown
of each persona including:
- Who they are
- Their biggest pain points and daily challenges
- The questions they commonly search for online
- The type of information they look for before making decisions
- What content would genuinely help them

After completing the research, generate [number] blog topic
ideas directly based on the pain points and questions you found.

Topics should be educational and informational, not promotional.
Format the topics as a numbered list.
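
If you run this prompt for more than one client, filling the brackets by hand gets error-prone. Here is a minimal sketch that fills the same prompt from variables. The URL is a placeholder, and the personas, location, and topic count are illustrative values I made up, not data from the project.

```python
PERSONA_PROMPT = """\
I am building a blog content strategy for {brand}.
The website is {url}.
The brand {description}.

The buyer personas are:
{personas}

Target location: {location}

Please use deep research to give me a detailed breakdown
of each persona including:
- Who they are
- Their biggest pain points and daily challenges
- The questions they commonly search for online
- The type of information they look for before making decisions
- What content would genuinely help them

After completing the research, generate {topic_count} blog topic
ideas directly based on the pain points and questions you found.

Topics should be educational and informational, not promotional.
Format the topics as a numbered list.
"""


def build_persona_prompt(brand, url, description, personas, location, topic_count):
    """Fill the persona research prompt; personas is a list of short descriptions."""
    persona_lines = "\n".join(f"- {persona}" for persona in personas)
    return PERSONA_PROMPT.format(
        brand=brand,
        url=url,
        description=description,
        personas=persona_lines,
        location=location,
        topic_count=topic_count,
    )


prompt = build_persona_prompt(
    brand="Tiger Tail",
    url="https://example.com",  # placeholder, not the real site URL
    description="helps small and mid-size businesses implement AI",
    personas=["Managing partner at a small law firm", "Owner of a home services company"],
    location="United States",  # illustrative
    topic_count=20,  # illustrative
)
print(prompt)
```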

Step 4 — Turn off Deep Research before the next step

Once you have the topic list, disable Deep Research. The next step is a formatting and planning task, not a research task. Keeping Deep Research on slows things down without adding value at this stage.

Step 5 — Build the calendar with a second prompt

Paste the topic list back into the AI and ask it to turn those topics into a structured blog calendar. Here is the prompt:

calendar-build-prompt.txt

Using the blog topics listed above, please create a blog
calendar for [Brand Name].

Starting month: [month and year]
Blogs per month: [number]
Total duration: [number of months]

For each blog topic include:
- The topic title
- A brief content outline covering the key points
- The target buyer persona this post is written for
- A suggested publish date

Format this as a table with four columns:
Topic Title | Content Outline | Persona | Publish Date

So I can copy it directly into a spreadsheet.

In one working session, you now have a 3 to 6 month blog calendar with clear topics, content outlines, persona targeting, and publish dates. A writer can start immediately without further briefing. A client can review it as a deliverable.
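If pasting the table into a spreadsheet does not split the columns cleanly, a short script can convert it. This is a sketch that assumes the AI returned pipe-delimited rows in the four-column format requested above, possibly with a markdown separator row. The sample row is invented for the example.

```python
import csv
import io


def table_to_csv(table_text):
    """Convert pipe-delimited calendar rows into CSV text."""
    out = io.StringIO()
    writer = csv.writer(out)
    for line in table_text.splitlines():
        line = line.strip().strip("|")
        if not line:
            continue
        cells = [cell.strip() for cell in line.split("|")]
        # Skip markdown separator rows such as ---|---|---|---
        if all(set(cell) <= set("-: ") for cell in cells):
            continue
        writer.writerow(cells)
    return out.getvalue()


sample = """
Topic Title | Content Outline | Persona | Publish Date
---|---|---|---
Sample topic | Point one; point two | Sample persona | 2025-01-06
"""
csv_text = table_to_csv(sample)
print(csv_text)
```

The `csv` module handles quoting, so outlines that contain commas still land in a single cell.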

What the Calendar Actually Gives You

The obvious output is a publishing plan. But the less obvious output is the removal of decision fatigue. One of the main reasons blogs become inconsistent is that every publishing cycle starts with the question of what to write next. That question never fully gets answered, the deadline passes, and the blog goes quiet for another month.

With a calendar in place, that question is already answered for the next six months. The only job left is execution. That shift from deciding to doing is what makes consistent publishing actually happen in practice rather than just in plans.

For consultants and agencies, the calendar also works as a client deliverable. It demonstrates strategic thinking beyond just writing. It shows that the content has a reason to exist, a defined audience, and a structure that builds toward something over time.

Why Consistency Is the Most Underrated Factor in Blog SEO

One blog post almost never produces meaningful results on its own. SEO from blogging is a compounding activity. The value builds as more posts are published, more keywords get covered, and Google increasingly recognises the website as a trustworthy source on a specific set of topics.

A business that publishes four well-targeted posts per month for six months has 24 pages competing for organic traffic. A business that publishes randomly has gaps, inconsistency, and a much weaker topical authority signal. Google notices the difference.

The calendar is not just a content planning document. It is the system that makes compounding SEO possible by turning irregular publishing into a predictable habit.

Topical authority does not come from one great post. It comes from consistent coverage of a specific subject area over time. Google needs to see a pattern before it starts treating a website as an authority on anything.

Competitor Approach vs Persona Approach — Which One Is Right

The competitor approach works best when there is proven search demand in the niche, multiple competitors are already getting blog traffic, and the primary goal is capturing a share of existing organic traffic as efficiently as possible.

The persona approach works best when the industry is niche or specialist, competitors are not actively blogging, the business wants to build a distinct voice, or the goal is long-term audience trust rather than short-term traffic volume.

The strongest content strategies use both. The competitor approach fills the calendar with high-demand topics that have a direct path to organic rankings. The persona approach fills the gaps with audience-first content that builds deeper relevance and trust over time. Together they cover both the traffic goal and the authority goal that I wrote about in the first post in this series.

Want Help Building This for Your Business?

A blog calendar built on real persona research gives you months of direction in a single session. But the research is only as good as the understanding of the audience behind it. If you want to build a content strategy that is actually tailored to your customers and your business goals, this is something I work through with clients directly.

Whether you need a full content strategy, help with SEO, or a conversation about what your blog should actually be doing for your business, book a call and we can get into the specifics.

See how I approach content and SEO strategy →

Explore done-for-you SEO →

Book a free 30-minute call →

Dhruv is an SEO consultant working with business owners, founders, and agencies. If you want a blog that actually builds something, this is where to start.


How This Started

The brief was not complicated. A new AI implementation consultancy — Tiger Tail, based in Montclair, NJ — had just launched their website and needed a content strategy. They serve small and mid-size businesses across industries like legal, healthcare, real estate, home services, and finance. The site had industry pages and service pages already mapped out. What it did not have was a blog that could actually build organic traffic over time.

This is a situation I see constantly. The website exists. The pages are live. But without a content layer built around what the target audience is actually searching for, those pages sit there doing nothing. Google has no reason to show the site to anyone because there is no signal of depth, authority, or relevance yet.

The goal was to build that signal. Deliberately, systematically, over 24 months.

The Starting Point: Keywords and Page Mapping

Before writing a single brief or topic idea, the first step was understanding what the site was already trying to rank for and what search volume existed behind each page.

Every industry page and service page got mapped to its primary keywords and monthly search volumes. Not as a rough estimate but with specific data points that shaped priority decisions later.

A few examples from the service and industry pages:

keyword-page-mapping.txt

Page URL                              Primary Keyword                    Monthly Searches

/services/ai-audit-strategy            ai strategy consultant                      880
/services/ai-audit-strategy            ai readiness assessment                     720
/services/growth-engineering           ai marketing automation                     720
/services/custom-ai-development        ai integration services                     590
/services/ai-audit-strategy            automation consultant                       480
/services/custom-ai-development        custom ai development company               480
/ai-for-legal                          ai for law firms                          1,300
/ai-for-real-estate                    ai real estate agent                        590

 

This mapping does two things. First, it tells you which pages matter most from a traffic potential standpoint. Second, it tells you which blog clusters need to be built first to support those pages with topical authority before competitors lock in their positions.

The legal page targeting “ai for law firms” at 1,300 searches per month, for example, is a page worth fighting for. But a new domain cannot rank for that keyword by just having a service page. It needs a cluster of supporting blog content that signals to Google that this site genuinely understands legal AI from multiple angles.

Building the Cluster Architecture

The core structural decision was to organise the entire blog around topical clusters rather than individual posts. Eleven clusters in total, each one mapped to either a service page or an industry page, each containing ten posts.

Cluster                         Parent Page                           Posts

AI Audit and Strategy           /services/ai-audit-strategy              10
Workflow Automation             /services/workflow-automation            10
Custom AI Development           /services/custom-ai-development          10
Systems and Operations Design   /services/systems-operations-design      10
Growth Engineering              /services/growth-engineering             10
AI Training and Enablement      /services/ai-training-enablement         10
Home Services                   /ai-for-home-services                    10
Real Estate                     /ai-for-real-estate                      10
Legal                           /ai-for-legal                            10
Healthcare                      /ai-for-healthcare                       10
Finance and Accounting          /ai-for-finance-accounting               10

110 posts total. Each cluster functions as a self-contained body of content on one subject, with every post linking back to the parent page and cross-linking to related posts within the same cluster. The effect builds over time: the more posts in a cluster, the stronger the topical authority signal, and the more likely every post in that cluster is to rank higher than it would in isolation.
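The linking rule described above can be sketched as a list of source and target pairs. Two assumptions here: the post URLs are hypothetical, and the sketch links every post to every sibling, whereas in practice you would likely cross-link only the posts that are genuinely related.

```python
from itertools import permutations


def cluster_links(parent_url, post_urls):
    """Return (source, target) internal link pairs for one cluster."""
    links = [(post, parent_url) for post in post_urls]  # every post -> parent page
    links += list(permutations(post_urls, 2))  # every post -> every sibling post
    return links


# Hypothetical URLs for the ten posts in the legal cluster.
legal_posts = [f"/blog/legal-post-{n}" for n in range(1, 11)]
links = cluster_links("/ai-for-legal", legal_posts)
print(len(links), "internal links in the cluster")
```

Even with the full sibling mesh trimmed back, the parent page in a ten-post cluster collects ten internal links that all come from closely related content.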

One post about AI for law firms is a blog post. Ten interconnected posts about AI for law firms, each covering a different angle and all linking back to the same parent page, is a topical authority signal. Google treats these very differently.

The Research Layer: Where Most Strategies Stop Short

Topic ideas are the easy part. Every SEO agency can give you a list of blog titles. What separates a content strategy that actually performs from one that just fills up a blog page is the research behind each post.

For this project, every single post got its own research data pulled from Perplexity Sonar. Not generic AI training data. Live web research with real statistics, named sources, publication dates, and citation URLs.

The difference this makes is significant. A blog post about physician burnout that says “burnout is a growing problem in healthcare” is forgettable. A blog post that cites the AMA’s finding that 43.2 percent of physicians reported at least one symptom of burnout in 2024, down from 48.2 percent in 2023 but still far above 2011 levels, with a link to the source — that is a post that earns trust and ranks.

I cover exactly how I run the Perplexity Sonar research process in the next post in this series. The short version is that each cluster required a dedicated research prompt designed to return current statistics, pain points with quantified data, ROI benchmarks, and competitor content gaps. That research became the backbone of every brief.

The Publishing Strategy: Pace and Cluster Priority

A common mistake in content strategy is publishing randomly across topics and hoping something sticks. The publishing plan for this project was deliberately sequenced.

publishing-schedule.txt

# Publishing pace

Weeks 1 to 8    1 post per week on Mondays
Week 9 onwards  2 posts per week — Mondays and Thursdays
Total duration  approximately 24 months

# Cluster priority order (lowest to highest competition)

1.  AI Audit and Strategy       — establishes what the business does
2.  Home Services               — lower competition, local long-tail
3.  Workflow Automation         — strong long-tail, less dominated
4.  Legal                        — higher volume, domain has history by now
5.  Real Estate                  — competitive but authority building
6.  Healthcare                   — mid competition
7.  Finance and Accounting
8.  Custom AI Development
9.  Growth Engineering
10. Systems and Operations
11. AI Training and Enablement

 

The logic behind starting slow and ramping up is that Google needs time to learn a new domain. Publishing 20 posts in the first month on a brand new site does not accelerate that process. Publishing consistently, at a pace the site can sustain, signals stability and intent. The ramp to two posts per week after eight weeks happens once the foundation is established.
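The pace can be turned into actual dates. This sketch assumes a hypothetical start date (any Monday works) and applies the rule from the schedule literally: Mondays only for the first eight weeks, then Mondays and Thursdays.

```python
from datetime import date, timedelta


def publishing_dates(start_monday, total_posts, ramp_weeks=8):
    """Return publish dates: Mondays during the ramp, then Mondays and Thursdays."""
    dates = []
    week = 0
    while len(dates) < total_posts:
        monday = start_monday + timedelta(weeks=week)
        dates.append(monday)
        # From week 9 onwards (index 8), add a Thursday slot as well.
        if week >= ramp_weeks and len(dates) < total_posts:
            dates.append(monday + timedelta(days=3))
        week += 1
    return dates


start = date(2025, 1, 6)  # hypothetical start date, a Monday
schedule = publishing_dates(start, total_posts=110)
weeks_needed = (schedule[-1] - schedule[0]).days // 7 + 1
print(len(schedule), "posts over", weeks_needed, "weeks")
```

Run as written, 110 posts fit into 59 weeks, which is shorter than the roughly 24 months quoted in the schedule. The longer figure presumably allows for something beyond the publishing run itself, such as slipped weeks or time for the content to compound, but that reading is my assumption.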

The cluster priority order follows a deliberate pattern too. Start with the clusters where competition is lowest so early posts have a realistic chance of ranking while the domain is still young. Build authority there. Then move into more competitive territory once Google has started to trust the site.

Publishing high-competition content too early on a new domain is one of the most common content strategy mistakes. The posts exist, they just sit on page eight indefinitely. Starting with winnable keywords lets early content generate signals that lift everything published later.

What the SEO Timeline Actually Looks Like

Part of building a strategy is being honest with the client about what to expect and when. Content SEO on a new domain does not produce results in the first month. Anyone who tells you otherwise is selling something.

seo-timeline-expectations.txt

Months 1 to 4
Publishing consistently. Very little organic traffic yet.
Google is learning the site. Foundation being built.

Months 4 to 6
First long-tail posts appearing on pages 2 and 3.
Some early page 1 wins on low-competition keywords.

Months 6 to 9
Meaningful organic traffic begins.
Cluster authority starts to show in rankings.

Months 9 to 12
Compounding effect begins.
Domain authority building noticeably.

Months 12 to 18
Consistent inbound leads from organic search.
Earlier posts climbing as domain strengthens.

 

This timeline is what I shared with the client upfront. Not because it is pessimistic but because it is accurate. Content SEO compounds. The value of every post published in month two does not peak in month two. It peaks in month ten when the domain has authority, the cluster has depth, and Google has seen consistent publishing for nearly a year.

The businesses that give up at month three are the ones that never find out what month twelve would have looked like.

The Writing Framework

With 110 posts across 11 different industries and service areas, consistency of quality was a real challenge. The solution was a master writing prompt that every post gets written through — one that carries the brand voice, tone rules, structural requirements, and humanizer guidelines, and adapts by industry.

The prompt covers things like: never open with “In today’s digital landscape,” no em dashes anywhere, every strong claim backed by a named source with an inline link, and a specific tone shift depending on whether the post is for a home services contractor or a law firm partner. Those two audiences need to be spoken to completely differently even if the underlying AI subject is similar.

I cover the full writing framework and how to build one in the last post in this series.

What This Whole Thing Actually Delivers

At the end of this process, the client had something most businesses never build: a content system with a reason behind every decision. Every post has a cluster it belongs to. Every cluster has a parent page it supports. Every parent page has keywords worth ranking for. And every keyword was chosen because real people search for it when they have a problem the client can solve.

That is not a blog. That is a compounding organic acquisition channel built to run for two years and keep delivering after that.

110 posts. 11 clusters. 24 months. Every post researched with real data, every cluster mapped to a page worth ranking, every keyword chosen with intent. This is what a content strategy looks like when it is built to actually work.

Want Something Like This for Your Business?

If you are running a business and your blog is either not working or not started yet, this kind of strategy is what bridges the gap between publishing and actually getting found. It is not about writing more. It is about building the right architecture before the first post goes live.

The next posts in this series go deeper into each layer of the process — keyword mapping, research with Perplexity Sonar, cluster architecture, publishing strategy, and the writing framework. If you want to talk about building this for your own business, book a call.

See how I build SEO strategy →

Book a free 30-minute call →

Dhruv is an SEO consultant working with business owners, founders, and agencies. If organic search is not delivering for your business, this is where to start.

dhruv-seo.online

Why I Built This

Running a niche SEO blog while managing client work is a time problem. You know you need to publish consistently. You know Google rewards sites that stay fresh and build topical depth. But actually writing two quality articles a day on top of everything else is not realistic for most people running a real business.

I looked at the standard options. Freelance writers cost $50 to $150 per article and still need briefing, editing, and back-and-forth. Generic AI writing tools produce content that reads like a Wikipedia article written by someone who has never done SEO. Content agencies are slow, expensive, and almost always off-brand.

None of those options solved the actual problem. I needed something that researched topics properly, wrote in a real voice, structured content for both readers and search engines, and ran every morning without me involved. So I built it for AI SEO Gazette from scratch.

What the System Actually Does

Every morning at 9 AM, a Python script wakes up on GitHub Actions and runs through the same sequence for two articles: one covering a current AI or SEO news story from the past 48 hours, and one covering an evergreen topic that practitioners are actively searching for right now.

For each article, it finds a topic worth writing about, pulls deep research from the web with real citations, hands that research to GPT-4o to write a structured 850 plus word article in HTML, fetches a featured image, uploads everything to WordPress with the right categories and tags, and immediately submits the new URL to Google Search Console for indexing.

Total runtime each morning: 8 to 12 minutes. Human involvement required: zero.

[Image: "How the system works, end to end". A left-to-right pipeline of eight stages: GitHub Actions 9AM, Topic Selection, Deep Research, GPT-4o Writing, Quality Check, Featured Image, Published, GSC Indexed. Summary bar: "Runtime: 8 to 12 minutes per day, fully automated".]

The Stack, Explained Simply

Here is every tool in the system and what it actually does:

stack-overview.txt

# THE FULL STACK

Scheduler      GitHub Actions (cron)     — runs at 9 AM IST daily, free tier
Topic finder   Perplexity Sonar API      — live web access, real current topics
Researcher     Perplexity Sonar API      — deep research with citation URLs
Writer         OpenAI GPT-4o            — structured HTML article output
Publisher      WordPress REST API + JWT  — posts directly to WordPress
Images         Unsplash API             — free high quality featured photos
Indexing       GSC Indexing API         — tells Google to crawl immediately
Language       Python 3                 — glues everything together

The most important thing to understand about this stack is that none of these tools are expensive or obscure. GitHub Actions is free. Unsplash is free. The Google Search Console Indexing API is free. The only real costs are the Perplexity and OpenAI API calls, and those add up to roughly $0.12 to $0.35 per day.

How the Pipeline Flows

The script runs as a single Python file. Here is the full flow from trigger to published article:

pipeline-flow.txt

GitHub Actions cron trigger (3:30 AM UTC = 9 AM IST)
         |
         v
WordPress JWT authentication
         |
         v
Bulk pre-load ALL categories + tags into memory
         (one paginated GET per taxonomy, more on why this matters later)
         |
         v
FOR EACH article type [news, evergreen]:
         |
         +-- Perplexity Call 1: Topic selection
         |       returns: title, angle, source URL
         |
         +-- Perplexity Call 2: Deep research on that topic
         |       returns: 4000-8000 chars of research + citations
         |
         +-- Filter citations to authority domains only
         |
         +-- GPT-4o: Write full article as JSON
         |       input:  system prompt + research + citations
         |       output: title, HTML content, meta, categories, tags
         |
         +-- Validate: word count ≥700, FAQ block present,
         |          ≥3 categories, ≥4 tags
         |          (auto-retry once if failed)
         |
         +-- Unsplash: Fetch featured image
         |
         +-- WordPress: Upload image, resolve term IDs, publish post
         |
         +-- Google Search Console: Submit URL for immediate indexing
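
The "filter citations to authority domains only" step in the flow above might look something like the following. This is a minimal sketch, not the script's actual code: the `AUTHORITY_DOMAINS` allowlist and the `filter_citations` name are hypothetical, since the real list is not shown in this post.

```python
from urllib.parse import urlparse

# Hypothetical allowlist for illustration only; the real list is not shown in this post.
AUTHORITY_DOMAINS = {"developers.google.com", "searchengineland.com", "openai.com"}


def filter_citations(urls, allowed=AUTHORITY_DOMAINS):
    """Keep only URLs whose host is an allowed domain or a subdomain of one."""
    kept = []
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        # Exact match, or a true subdomain (the leading dot blocks "notopenai.com").
        if any(host == d or host.endswith("." + d) for d in allowed):
            kept.append(url)
    return kept
```

Matching on the parsed hostname rather than searching the whole URL string means a spam URL that merely contains an authority domain in its path does not slip through.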

The Three Bugs That Nearly Broke Everything

The pipeline above looks clean now. It was not clean getting here. The system ran for several days publishing broken articles before I tracked down what was actually going wrong. Here are the three bugs in order of how much they cost me in time and frustration.

Bug One: WordPress Was Silently Ignoring Categories and Tags

Every article was publishing with exactly one category and zero tags. The script was not crashing. No errors were being thrown. It just quietly skipped every term after the first couple and moved on.

The cause was WordPress rate limiting. The original code called the REST API once per category and once per tag, sequentially. That is roughly 13 API calls in a row. WordPress started returning HTTP 429 errors after the first two or three calls, and the code was silently swallowing those errors and skipping the terms.

the-actual-log-output.txt

[WARNING] Skipping term 'AI in SEO': HTTP 400
[WARNING] Skipping term 'SEO Strategies': HTTP 429
[WARNING] Skipping term 'Content Optimization': HTTP 429
[WARNING] Skipping term 'SEO News': HTTP 429
[INFO] Categories: 1 assigned | Tags: 0 assigned

The fix was to stop making individual per-term API calls entirely. Instead of asking WordPress for each term one by one, the script now makes one bulk, paginated GET per taxonomy at startup (100 terms per page) and loads every existing category and tag into memory as a dictionary. From that point, term resolution is an instant lookup with no API calls at all.

preload_wp_terms.py

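import requests

WP_URL = "https://example.com"  # placeholder: replace with your WordPress site URL
WP_CATEGORY_CACHE = {}  # lowercase category name -> term ID
WP_TAG_CACHE = {}       # lowercase tag name -> term ID

def preload_wp_terms():
    """Load every existing category and tag into memory, 100 terms per page."""
    for taxonomy, cache in [("categories", WP_CATEGORY_CACHE), ("tags", WP_TAG_CACHE)]:
        page = 1
        while True:
            r = requests.get(
                WP_URL + "/wp-json/wp/v2/" + taxonomy,
                params={"per_page": 100, "page": page},
                # ... remaining arguments elided in the original
            )
            r.raise_for_status()  # surface HTTP errors instead of failing silently
            items = r.json()
            for item in items:
                cache[item["name"].lower()] = item["id"]  # instant lookup later
            if len(items) < 100:
                break  # a short page means there are no more pages
            page += 1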

Result: Both articles now consistently get 5 categories and 5 tags assigned on every single run.
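
Once the cache is loaded, turning the term names the model returns into WordPress term IDs is a plain dictionary lookup. A rough sketch, assuming a cache shaped like the one above (lowercase name mapped to ID). `resolve_term_ids` is a hypothetical helper name, and what the real script does with unknown terms is not shown in this post, so this version logs and skips them:

```python
import logging

logger = logging.getLogger("pipeline")


def resolve_term_ids(names, cache):
    """Map term names to IDs using the preloaded cache, with no API calls."""
    ids = []
    for name in names:
        term_id = cache.get(name.strip().lower())
        if term_id is None:
            # Say why the term was skipped so the failure is never silent.
            logger.warning("Skipping term %r: not in preloaded cache", name)
            continue
        if term_id not in ids:  # avoid sending the same ID twice
            ids.append(term_id)
    return ids
```

For example, `resolve_term_ids(["SEO News", "AI in SEO"], WP_CATEGORY_CACHE)` would return the matching IDs in order, without touching the network.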

Bug Two: GPT-4o Was Including Its Own Instructions Inside the Article

This one was embarrassing to find live on the site. Published articles had visible headings like “HOOK”, “FAQ Block”, and “External Links” appearing as actual text that readers could see. The model was treating the numbered section labels in the prompt as headings to include in the HTML output.

The original prompt framed the article structure like this:

broken-prompt-structure.txt

1. HOOK: One or two punchy opening sentences...
2. KEY TAKEAWAYS: Use this exact HTML block...
4. EXTERNAL LINKS: Naturally embed 2-3 links...
5. FAQ BLOCK: Exactly 4 Q&A pairs...

// GPT-4o read these as section titles and output them as <h2> tags
// Result: readers saw "HOOK" and "FAQ Block" as visible article headings

The fix was to rewrite the prompt structure entirely, replacing numbered labels with “Part 1, Part 2” framing and adding an explicit hard rule at the top of the prompt:

fixed-prompt-rule.txt

CRITICAL: Do NOT output any meta-labels or section titles like 'HOOK',
'BODY', 'FAQ Block', 'External Links', 'CTA', or any numbered section
markers as visible text in the article. These are writing instructions
for you, not headings to include in the output.

Result: Clean article output every time. No structural labels, no prompt bleed-through, content reads naturally from top to bottom.
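
A prompt rule like this can be backed by a cheap check on the output before publishing. The following is a sketch under assumptions: the label list mirrors the labels named in the rule above, `find_leaked_labels` is a hypothetical name, and nothing here is confirmed to be part of the actual script.

```python
import re

# Labels taken from the prompt rule above; extend as needed.
FORBIDDEN_LABELS = ["HOOK", "BODY", "FAQ Block", "External Links", "CTA"]

# Matches any heading tag (h1 to h6) and captures its inner HTML.
HEADING_RE = re.compile(r"<h([1-6])[^>]*>(.*?)</h\1>", re.IGNORECASE | re.DOTALL)


def find_leaked_labels(html):
    """Return headings whose text is just a prompt label, e.g. '<h2>HOOK</h2>'."""
    forbidden = {label.lower() for label in FORBIDDEN_LABELS}
    leaked = []
    for _, inner in HEADING_RE.findall(html):
        text = re.sub(r"<[^>]+>", "", inner).strip()  # drop nested tags
        text = re.sub(r"^\d+[.)]\s*", "", text)        # drop a leading "1." marker
        text = text.rstrip(":").strip()
        if text.lower() in forbidden:
            leaked.append(text)
    return leaked
```

If the returned list is non-empty, the article could be sent back for a retry instead of going live with a visible "HOOK" heading.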

Bug Three: Word Count Kept Falling Short Even After Retries

Articles were coming in at 526 to 637 words even though the prompt asked for 850 plus. The retry sometimes made it worse, not better. The model was finishing the article structure and stopping, treating “I have covered all the sections” as the signal to end rather than “I have hit the word count.”

The issue was that “write at least 850 words” was buried at the end of a long prompt and gave the model no actionable instruction for what to do when it was running short. The fix was to make the requirement impossible to miss and give the model a specific action to take if it was under target:

word-count-fix.txt

MANDATORY WORD COUNT: The 'content' field must contain AT LEAST 850 words
of readable text (excluding HTML tags). Count carefully.

If you finish the how-to section and the FAQ and you have fewer than
850 words, you have not written enough.

Add more H2 body sections before the FAQ until you reach 850 words.
Do not truncate early. The main body section alone must be at least
600 words by itself.

Result: Articles now consistently hit 700 to 850 words on the first attempt. When the retry triggers, it produces 820 plus words reliably.
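
Counting words "excluding HTML tags" on the script side, rather than trusting the model's own count, could look like this. It is a minimal sketch using only the standard library. `count_readable_words` and `passes_quality_check` are hypothetical names. The thresholds (700 words, 3 categories, 4 tags) come from the validation step in the pipeline flow. The FAQ check, which just looks for the string "faq" in the markup, is a stand-in because the real FAQ block's HTML is not shown in this post.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and ignores tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def count_readable_words(html):
    """Count whitespace-separated words in the text of an HTML string."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Joining with a space keeps adjacent block elements from merging words.
    return len(" ".join(parser.chunks).split())


def passes_quality_check(article, min_words=700, min_categories=3, min_tags=4):
    """article: dict with 'content' (HTML), 'categories' and 'tags' (lists)."""
    return (
        count_readable_words(article["content"]) >= min_words
        and "faq" in article["content"].lower()  # placeholder FAQ detection
        and len(article["categories"]) >= min_categories
        and len(article["tags"]) >= min_tags
    )
```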

What a Clean Run Looks Like

After all three fixes landed, here is what the actual log output looked like on the first fully successful run:

clean-run-log.txt

[INFO] WP term cache: 34 categories, 39 tags loaded.

[INFO] Article: Google AI Max Now Available (649 words)
[WARNING] Quality check failed — retrying...
[INFO] Retry result: PASSED (828 words)
[INFO] Categories: 5 assigned | Tags: 5 assigned
[INFO] Published: aiseogazette.com/google-ai-max-for-search-campaigns/
[INFO] GSC submitted: aiseogazette.com/google-ai-max-for-search-campaigns/

[INFO] Article: Mastering Generative Engine Optimization (762 words)
[INFO] Categories: 5 assigned | Tags: 5 assigned
[INFO] Published: aiseogazette.com/mastering-generative-engine-optimization/
[INFO] GSC submitted: aiseogazette.com/mastering-generative-engine-optimization/

[INFO] All 2 articles published successfully. Total runtime: 9 min 42 sec
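
The "quality check failed, retrying" behaviour visible in this log can be expressed as a small wrapper. This is a sketch, assuming the writer and the validator are passed in as plain functions. `write_with_retry`, `generate` and `validate` are hypothetical names, not the script's real ones.

```python
def write_with_retry(generate, validate, max_retries=1):
    """Call generate() until validate(article) passes or retries run out.

    Returns (article, passed). The last attempt is returned even if it failed,
    so the caller decides whether to publish or skip.
    """
    article = generate(attempt=0)
    for attempt in range(1, max_retries + 1):
        if validate(article):
            return article, True
        article = generate(attempt=attempt)  # one more try with the same inputs
    return article, validate(article)
```

Passing the attempt number into `generate` leaves room for the retry to use a stricter prompt than the first pass, which is one way to act on the word count lesson above.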

What It Actually Costs

Tool                                   Daily Cost         Monthly Cost
Perplexity Sonar API (4 calls/day)     ~$0.02 to $0.05    ~$0.60 to $1.50
OpenAI GPT-4o (2 to 4 calls/day)       ~$0.10 to $0.30    ~$3 to $9
GitHub Actions                         $0                 $0 (free tier)
Unsplash API                           $0                 $0 (free tier)
Google Search Console Indexing API     $0                 $0 (free tier)
Total                                  ~$0.12 to $0.35    ~$4 to $10
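
The totals in the table are just the per-tool ranges added up and multiplied out. A quick sketch of that arithmetic, assuming a 30-day month; the daily figures are copied from the table and the variable names are illustrative:

```python
# (low, high) daily cost in USD per tool, taken from the table above.
DAILY_COSTS = {
    "Perplexity Sonar API": (0.02, 0.05),
    "OpenAI GPT-4o": (0.10, 0.30),
    "GitHub Actions": (0.0, 0.0),
    "Unsplash API": (0.0, 0.0),
    "GSC Indexing API": (0.0, 0.0),
}

DAYS_PER_MONTH = 30  # assumption for the monthly estimate

daily_low = sum(low for low, _ in DAILY_COSTS.values())
daily_high = sum(high for _, high in DAILY_COSTS.values())

print(f"Daily:   ${daily_low:.2f} to ${daily_high:.2f}")
print(f"Monthly: ${daily_low * DAYS_PER_MONTH:.2f} to ${daily_high * DAYS_PER_MONTH:.2f}")
```

This prints a daily range of $0.12 to $0.35 and a monthly range of $3.60 to $10.50, which the table rounds to roughly $4 to $10.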

To put that in perspective: two researched, structured, published, and indexed articles every single day for the cost of a coffee per month. At the $50 to $150 per article that freelance writers charge, the manual equivalent of this output (730 articles a year) would cost roughly $36,500 to $109,500 a year in writer fees alone.

What the System Cannot Do Yet

I want to be honest about the current limits because this is a real working system, not a concept piece.

It does not yet handle internal linking, meaning it will not automatically link new articles to older relevant posts on the site. It does not post to social media after publishing. It does not check whether a very similar topic was covered recently. And it does not yet read Search Console performance data to inform future topic selection, though that is the most interesting thing on the roadmap.

These are solvable problems. They just have not been built yet.

Five Things This Build Taught Me

Rate limits fail silently and that is the worst kind of failure. The WordPress 429 issue ran for days before I caught it because the script never crashed. Always log every skipped item with the actual reason it was skipped.

Tell the model what NOT to do, not just what to do. The section label bug was only fixed when I added an explicit negative instruction. Describing the structure you want is not enough on its own. You also have to describe what you do not want in the output.

Give the model an action, not just a target. “Write 850 words” is easy to ignore. “If you are under 850 words, add more H2 sections before the FAQ” gives it something concrete to do. Targets without actions get approximated. Actions get followed.

Bulk operations eliminate entire categories of bugs. Switching from 13 sequential API calls to one bulk GET per taxonomy did not just make the code faster. It made a whole class of rate-limiting failure impossible. Whenever you see sequential API calls in a loop, ask whether they can be batched.

Read the actual logs. Several times during this build I thought I understood the failure and I was wrong. The logs told the real story every time. Assumptions are expensive. Logs are free.
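
The first lesson, logging every skipped item with the actual reason, might be sketched like this for the calls that cannot be batched. The assumptions are worth stating loudly: `call_with_backoff` is a hypothetical helper, the request is passed in as a function returning an object with a `status_code` attribute (as `requests` responses have), and the attempt count and delays are illustrative defaults, not values from the script.

```python
import logging
import time

logger = logging.getLogger("pipeline")


def call_with_backoff(send, label, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send() and retry on HTTP 429 with exponential backoff.

    Returns the response on success, or None after logging why it gave up.
    """
    for attempt in range(1, max_attempts + 1):
        response = send()
        if response.status_code == 429 and attempt < max_attempts:
            delay = base_delay * 2 ** (attempt - 1)  # 1s, 2s, 4s, ...
            logger.warning("%s: HTTP 429, retrying in %.1fs (attempt %d/%d)",
                           label, delay, attempt, max_attempts)
            sleep(delay)
            continue
        if response.status_code >= 400:
            # Never skip silently: record the item and the exact status code.
            logger.error("Skipping %s: HTTP %d", label, response.status_code)
            return None
        return response
    return None
```

The `sleep` parameter exists so the delay can be swapped out in tests. In normal use it defaults to `time.sleep`.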

Want This for Your Own Site?

If you run a WordPress site and want a content system like this built for your specific niche, your brand voice, and your existing category structure, this is something I can build and set up for you. The tools exist, the approach is proven, and the ongoing cost is trivial. What takes time is getting the prompt right for your voice and your audience.

Beyond automation, if your business needs to show up in Google search results and in AI-generated answers (AEO), that is exactly what I work on with agencies, founders, and business owners every day. SEO and AEO are not separate strategies anymore. The sites that win over the next two years will be the ones that are structured for both.

If you want to talk about your content operations, your search visibility, or building a system like this for your own site, book a call. No sales pitch. Just a real conversation about what would actually move the needle for your business.

Read more on the blog →

Book a free 30-minute call →

Dhruv is an SEO and AEO consultant working with business owners, founders, and agencies. 500+ projects. 6+ years. If organic search is a problem for your business, this is the right place to start.

dhruv-seo.online
