GEO pre-publish checklist: author, date, Article JSON-LD
Pre-publish checklist for AI search: a visible author, honest dates, and Article + FAQ JSON-LD that matches the page. FAQ rich results ended May 7, 2026.

A page that AI can't attribute is a page AI won't cite confidently. So before you publish, the useful question isn't "which schema gets me into AI answers?" Google is explicit that no special markup does that. The useful question is whether three layers agree: what a reader sees (a real author, a clear date), what your JSON-LD says about those same facts, and whether a validator and a crawler can confirm both.
When those layers line up, search engines and AI systems can trust the page and quote it correctly. When they drift, the page reads as ambiguous, and ambiguous pages get paraphrased, mis-dated, or skipped. This checklist keeps them aligned, with the honest caveat that none of it is a magic switch.
Key takeaways
- Visible author and date are a trust and transparency layer, not a ranking switch. Google frames good content around "Who, How, Why" (Google).
- Show a prominent date and mirror it in datePublished/dateModified; the visible and structured values must match, with no future or event dates (Google).
- Article/BlogPosting has no required fields, but add the recommended ones: author, datePublished, dateModified, headline, image (Google).
- FAQPage rich results are gone from Google Search (May 7, 2026). Keep FAQ markup only as honest semantics for a Q&A users can see (Google).
- Validate in two loops (Schema Markup Validator, then Rich Results Test), and confirm the live URL. aiSiteReady checks all of these and scores your site 0–100.
Why do visible author and date matter for AI search?
They matter as a trust and transparency layer, not as a ranking lever. Google's own framework for assessing content is "Who, How, Why": who made it, how, and why. Its guidance on AI-assisted content says an accurate byline is something readers reasonably expect (Google). Google does not document the byline itself as a ranking factor.
That distinction is easy to overstate in both directions. "Author and date rank your page" is too strong; "author and date are pointless for SEO" is too weak. Google's Search Liaison has said bylines aren't a ranking bonus on their own. But the pages that consistently show real authorship and clear dates tend to share the other properties of helpful, trustworthy content. The byline is a symptom of good publishing, not a cheat code.
For AI citation, the mechanism is concrete. A retrieval system can only attribute a page whose author and date it can actually identify. If the byline is buried in the body, glued to the first paragraph, or absent from the markup, the model has to guess. A guessed attribution is one it will hedge or drop. Give it an unambiguous "who" and "when," and you make the page safe to quote.
This is the spine of what AI agent readiness means: the page has to be legible to a machine before any of its facts can travel. Author and date are the first two facts a citation needs.
Get the dates right: published vs. modified
Google's date guidance is a two-part instruction: add a prominent, user-visible date, then mirror it in structured data with datePublished and/or dateModified on a CreativeWork subtype like Article or BlogPosting (Google). That guidance is blunt about consistency: the visible date and the structured date must match. Don't use future dates, and don't put the date of the event the page describes where the publication date belongs.
The two fields are not interchangeable. datePublished is the moment the page first went live; treat it as immutable, and never bump it for a cosmetic edit. dateModified records the last substantive change: new data, a corrected claim, a rewritten section. It shouldn't move every time you fix a typo. Refreshing a date without a real update is exactly the "artificial freshness" Google warns against.
| Layer | What it means | Common mistake |
|---|---|---|
| Visible date | What the reader actually sees in the header | Buried far from the title, or competing with ten other dates on the page |
| datePublished | First publication of the page | Setting it to the event date, or rewriting it for minor edits |
| dateModified | Last substantive change | Bumping it on every trivial fix to fake freshness |
| Timezone | Offset for accurate interpretation | Leaving a timestamp with no offset and hoping Google infers it |
For time-sensitive posts, write the full ISO 8601 value with an offset, like 2026-06-20T09:30:00-04:00. That leaves no ambiguity about when "today" was. Keep the byline and date out of the first body paragraph in your source order. When they sit in their own header block, a crawler won't mistake them for the article text. If you also run a news sitemap, sync its publication_date to the same value, so the signals you give crawlers never contradict each other.
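The date rules above are mechanical enough to check before publishing. Here is a minimal sketch in Python (3.9+), not an official validator: the function name check_dates is hypothetical, and the "dateModified should not precede datePublished" rule is an inference from the two definitions above, not a quoted Google requirement. The current time is passed in so the check stays deterministic.

```python
from datetime import datetime, timezone


def check_dates(published: str, modified: str, now: datetime) -> list[str]:
    """Return the problems found in a datePublished/dateModified pair.

    `now` must be timezone-aware. Note: datetime.fromisoformat only
    accepts a trailing "Z" on Python 3.11+; numeric offsets work earlier.
    """
    problems = []
    parsed = {}
    for field, value in (("datePublished", published), ("dateModified", modified)):
        try:
            stamp = datetime.fromisoformat(value)
        except ValueError:
            problems.append(f"{field} is not ISO 8601: {value!r}")
            continue
        if stamp.tzinfo is None:
            problems.append(f"{field} has no timezone offset")
            # Assumption, for comparison only: read a naive value as UTC.
            stamp = stamp.replace(tzinfo=timezone.utc)
        if stamp > now:
            problems.append(f"{field} is in the future")
        parsed[field] = stamp
    if len(parsed) == 2 and parsed["dateModified"] < parsed["datePublished"]:
        problems.append("dateModified is earlier than datePublished")
    return problems


now = datetime(2026, 6, 22, tzinfo=timezone.utc)
print(check_dates("2026-06-20T09:30:00-04:00", "2026-06-21T14:05:00-04:00", now))  # → []
print(check_dates("2026-06-20T09:30:00", "2026-06-20T09:30:00-04:00", now))
# → ['datePublished has no timezone offset']
```

An empty list means the pair is internally consistent. It says nothing about whether the visible date matches, which still needs a look at the rendered page.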
What goes in your Article JSON-LD?
Start from the right expectation: Article, NewsArticle, and BlogPosting are all supported, and Google states there are no required properties. Instead, you add the ones that apply (Google). That's freeing and a trap at once, because "minimally valid" and "actually useful" are different bars. A schema with only @type validates fine and tells a machine almost nothing. The format itself is settled: JSON-LD is the W3C's standard serialization for linked data (W3C). The work is choosing which facts to include.
So aim past the minimum. Google's recommended set for an article is author, datePublished, dateModified, headline, and image, and those are the fields that carry the facts a citation needs. Here's a clean single-author BlogPosting that hits them:
{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "@id": "https://example.com/blog/agent-ready-checklist#article",
  "headline": "An agent-ready publishing checklist",
  "description": "How we verify author, date, and JSON-LD before a post goes live.",
  "url": "https://example.com/blog/agent-ready-checklist",
  "mainEntityOfPage": "https://example.com/blog/agent-ready-checklist",
  "inLanguage": "en-US",
  "image": [
    "https://example.com/images/checklist-16x9.png",
    "https://example.com/images/checklist-1x1.png"
  ],
  "author": {
    "@type": "Person",
    "name": "Dana Okafor",
    "url": "https://example.com/authors/dana-okafor"
  },
  "datePublished": "2026-06-20T09:30:00-04:00",
  "dateModified": "2026-06-20T09:30:00-04:00",
  "publisher": {
    "@type": "Organization",
    "name": "Example Labs",
    "url": "https://example.com/"
  }
}
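Because nothing is strictly required, a validator will not tell you when a recommended field is missing. A tiny check can. This is a rough sketch in Python (3.9+); the helper name missing_recommended is hypothetical, and the field list is simply the recommended set named above.

```python
# The recommended Article properties named in this section.
RECOMMENDED = ("author", "datePublished", "dateModified", "headline", "image")


def missing_recommended(node: dict) -> list[str]:
    """List recommended Article properties that are absent or empty."""
    return [key for key in RECOMMENDED if not node.get(key)]


# A node that validates but tells a machine almost nothing.
bare = {"@context": "https://schema.org", "@type": "BlogPosting"}
print(missing_recommended(bare))
# → ['author', 'datePublished', 'dateModified', 'headline', 'image']
```

Run against the BlogPosting sample above, the same function returns an empty list.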
One author, or several
Two author rules trip people up. First, every author shown on the page must appear in the markup. If there are several, each is its own object, never a single comma-joined string (Google). Second, author.name holds only the name: no "Posted by", no job title, no publisher. Add a url (or sameAs) so the person is identifiable across the web.
"author": [
  { "@type": "Person", "name": "Dana Okafor", "url": "https://example.com/authors/dana-okafor" },
  { "@type": "Person", "name": "Lev Marchenko", "url": "https://example.com/authors/lev-marchenko" }
]
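Both author rules can be linted with a few heuristics. The sketch below is illustrative only, written for Python 3.9+: author_problems is a hypothetical helper, and the prefix and comma checks are rough guesses that will flag some legitimate names (an organization with "and" in its name, for instance), so treat hits as prompts to look, not verdicts.

```python
import re


def author_problems(author) -> list[str]:
    """Flag common author markup mistakes. Heuristic, not authoritative."""
    entries = author if isinstance(author, list) else [author]
    problems = []
    for entry in entries:
        if not isinstance(entry, dict):
            problems.append(f"author should be an object, got {entry!r}")
            continue
        name = entry.get("name", "")
        if entry.get("@type") not in ("Person", "Organization"):
            problems.append(f"{name!r}: @type should be Person or Organization")
        # Byline prefixes such as "Posted by" or "By" do not belong in name.
        if re.match(r"\s*(posted by|by)\b", name, re.IGNORECASE):
            problems.append(f"{name!r}: name carries a byline prefix")
        # A comma or " and " often means several authors, or a title, in one string.
        if "," in name or " and " in name:
            problems.append(f"{name!r}: looks like more than a single name")
        if not (entry.get("url") or entry.get("sameAs")):
            problems.append(f"{name!r}: no url or sameAs")
    return problems


print(author_problems({"@type": "Person", "name": "Posted by Dana Okafor, Lev Marchenko"}))
```

The example input trips three checks at once: the prefix, the joined names, and the missing url.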
Fields Google doesn't list
Fields like publisher, description, url, and sameAs are valid schema.org and worth adding for completeness. They just aren't listed among Google's Article properties, so treat them as semantic polish, not a guaranteed feature trigger. For the full map of which schema.org types to reach for first, see structured data for AI agents.
Should you still add FAQPage JSON-LD in 2026?
Yes, but only as honest semantics, not for a Google feature. The FAQ rich result is gone: Google's documentation says FAQ rich results stopped showing in Search on May 7, 2026, and that FAQ support was removed from the Rich Results Test in June 2026 (Google). The FAQPage type still exists in schema.org, and a valid one still hands machines a clean, structured Q&A.
There's one non-negotiable rule that the deprecation doesn't change: the questions and answers must be visible to users on the page. Marking up a Q&A that only a bot can see is misleading structured data, and an accordion is fine only because the reader can open it. No schema-only answers.
When a post has a real FAQ, the cleanest model keeps the article as the main node and puts the FAQ in its own node, joined in one @graph:
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "BlogPosting",
      "@id": "https://example.com/blog/agent-ready-checklist#article",
      "headline": "An agent-ready publishing checklist",
      "url": "https://example.com/blog/agent-ready-checklist",
      "mainEntityOfPage": "https://example.com/blog/agent-ready-checklist",
      "author": { "@type": "Person", "name": "Dana Okafor", "url": "https://example.com/authors/dana-okafor" },
      "datePublished": "2026-06-20T09:30:00-04:00",
      "dateModified": "2026-06-21T14:05:00-04:00"
    },
    {
      "@type": "FAQPage",
      "@id": "https://example.com/blog/agent-ready-checklist#faq",
      "url": "https://example.com/blog/agent-ready-checklist",
      "mainEntity": [
        {
          "@type": "Question",
          "name": "Is FAQ markup still useful in 2026?",
          "acceptedAnswer": {
            "@type": "Answer",
            "text": "Yes, as machine-readable structure for a Q&A users can already see. It is not a Google rich result, which no longer shows."
          }
        }
      ]
    }
  ]
}
Here's something you can check on this very page. Scroll to the FAQ below the article, then open view-source. The same front-matter faq list renders the visible questions and generates the FAQPage JSON-LD. One source, so the markup can never describe answers the reader can't see. The page header does the same trick for the byline and the "Updated" date, which feed the BlogPosting node directly.
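The single-source idea is easy to reproduce in any template layer. The following is a minimal sketch in Python (3.9+), not the code behind this page: the render_faq function, the q and a keys, and the details/summary markup are all assumptions made for illustration.

```python
import json
from html import escape


def render_faq(faq: list[dict]) -> tuple[str, str]:
    """Build the visible FAQ HTML and the FAQPage JSON-LD from one list."""
    visible = "\n".join(
        f"<details><summary>{escape(item['q'])}</summary>"
        f"<p>{escape(item['a'])}</p></details>"
        for item in faq
    )
    node = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": item["q"],
                "acceptedAnswer": {"@type": "Answer", "text": item["a"]},
            }
            for item in faq
        ],
    }
    # Escape "<" so an answer containing "</script>" cannot end the tag early.
    payload = json.dumps(node, ensure_ascii=False).replace("<", "\\u003c")
    return visible, f'<script type="application/ld+json">{payload}</script>'


faq = [{"q": "Is FAQ markup still useful in 2026?",
        "a": "Yes, as machine-readable structure for a Q&A users can already see."}]
visible_html, json_ld_script = render_faq(faq)
print(visible_html)
print(json_ld_script)
```

Since both outputs come from the same loop over the same list, a question cannot reach the markup without also reaching the reader.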
How do you validate Article and FAQ JSON-LD before publishing?
Validate in two loops, because "valid schema.org" and "eligible for a Google feature" are different questions. Use the Schema Markup Validator for vocabulary correctness, and the Rich Results Test for whether Google can actually build an Article rich result. Then confirm the live page in Search Console once it's deployed.
Two tools, two blind spots
Each tool has a blind spot worth knowing. The Rich Results Test only judges Google-supported results, needs the page to be reachable anonymously, and ignores comments inside a JSON-LD block. Strip any // notes before publishing, since they aren't valid JSON anyway. The Schema Markup Validator is the one that checks the fields the Rich Results Test won't, like FAQPage, publisher, and sameAs. After the FAQ deprecation, that validator is your main FAQ test, because there's no rich-result preview left to lean on.
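A strict JSON parser catches leftover comments before either tool sees the page. Here is a small sketch using Python's standard json module; the helper name parse_json_ld is hypothetical.

```python
import json


def parse_json_ld(block: str):
    """Return (data, None) on success, or (None, message) for invalid JSON."""
    try:
        return json.loads(block), None
    except json.JSONDecodeError as err:
        return None, f"line {err.lineno}, column {err.colno}: {err.msg}"


with_comment = """{
  // remember to update the date
  "@context": "https://schema.org",
  "@type": "BlogPosting"
}"""

data, error = parse_json_ld(with_comment)
print(data, error)
```

For the commented block, data comes back as None and error points at the line holding the // note. Remove the comment and the same call returns the parsed object.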
What blocks validation first
Two failure modes break validation before either tool runs. If the page is blocked by robots.txt or noindex, the crawler never sees the markup, so how you govern AI crawlers matters here too. And if your JSON-LD only appears after client-side JavaScript runs, the many AI crawlers that don't render JavaScript miss it entirely. Put the semantic core and the JSON-LD in the initial HTML. Search Console closes the loop after launch, though its reports are sampled, not exhaustive: they surface only supported types Google has already found.
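You can approximate what a non-rendering crawler sees with the standard library alone. The sketch below makes several assumptions: the class InitialHtmlScan, the helper allowed_by_robots, and the ExampleBot user agent are hypothetical names, and the raw HTML and robots.txt are passed in as strings so nothing touches the network. It looks only at the markup as served, with no JavaScript execution.

```python
import json
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class InitialHtmlScan(HTMLParser):
    """Collect JSON-LD blocks and a robots noindex flag from unrendered HTML."""

    def __init__(self):
        super().__init__()
        self.json_ld = []      # JSON-LD documents that parsed cleanly
        self.invalid = 0       # JSON-LD blocks that failed to parse
        self.noindex = False
        self._capturing = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "script" and (attributes.get("type") or "").lower() == "application/ld+json":
            self._capturing = True
            self._chunks = []
        elif tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            if "noindex" in (attributes.get("content") or "").lower():
                self.noindex = True

    def handle_data(self, data):
        if self._capturing:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._capturing:
            self._capturing = False
            try:
                self.json_ld.append(json.loads("".join(self._chunks)))
            except json.JSONDecodeError:
                self.invalid += 1


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check a URL against robots.txt rules supplied as text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


raw_html = """<html><head>
<meta name="robots" content="index,follow">
<script type="application/ld+json">{"@type": "BlogPosting"}</script>
</head><body><div id="app"></div></body></html>"""

scan = InitialHtmlScan()
scan.feed(raw_html)
print(len(scan.json_ld), scan.noindex)  # → 1 False
print(allowed_by_robots("User-agent: ExampleBot\nDisallow: /", "ExampleBot",
                        "https://example.com/blog/agent-ready-checklist"))  # → False
```

If the count of JSON-LD blocks is zero here but the browser's inspector shows markup, the schema is being injected client-side.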
Is there special "GEO schema" for AI?
No, and that's the most useful thing to internalize. Google is direct that there are no extra requirements and no special markup to appear in AI Overviews or AI Mode. It files "you need separate AI optimization" under myths, and adds that overfocusing on structured data is unnecessary (Google). For Google Search, "GEO" and "AEO" are still just disciplined SEO.
So a GEO pre-publish checklist looks almost identical to a good editorial one: real content, a real author, a clear date, a crawlable page, and correct Article markup that mirrors what's on screen. Structured data still earns its place. It reduces ambiguity, so a system extracts your canonical facts instead of reconstructing a fuzzier version from prose. But it's a clarity layer, not a backdoor, and chasing exotic "AI schema" is effort spent in the wrong place.
The pre-publish checklist
Here's the working list. It's deliberately short, and it starts with the page, not the markup.
- A visible author (or all authors) and at least one clear, prominent date sit in the page header, not the first paragraph.
- The visible author name matches author.name exactly, with no "Posted by", title, or publisher mixed in.
- Multiple authors are separate objects in the author array, never one comma-joined string.
- Each author uses the right @type (Person or Organization) and carries a url or sameAs.
- datePublished and dateModified are ISO 8601, ideally with a timezone, and the visible date matches them.
- datePublished is never rewritten for cosmetic edits, and the page isn't artificially refreshed.
- The Article/BlogPosting node carries at least headline, author, datePublished, and a real image.
- The image is the article's own image, crawlable and indexable, not the site logo.
- Any FAQ marked up as FAQPage is genuinely visible to the reader; no schema-only answers.
- JSON-LD comments are stripped, and the page is reachable anonymously (no noindex, no block).
- The markup lives in the initial HTML, not behind client-side JavaScript.
- It passes the Schema Markup Validator and the Rich Results Test, and the live URL checks out in Search Console.
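The first items on the list, visible facts matching structured facts, can be spot-checked in a few lines once you have pulled the byline and date out of the page header. This is a rough sketch under stated assumptions: mirror_problems is a hypothetical helper, the visible date is assumed to be available as a YYYY-MM-DD string (for example from a time element's datetime attribute), and the comparison uses the calendar date exactly as written in the markup, offset included.

```python
def mirror_problems(visible_authors, visible_date, node):
    """Compare the visible byline and date with an Article JSON-LD node."""
    problems = []
    author = node.get("author") or []
    entries = author if isinstance(author, list) else [author]
    marked_up = [entry.get("name", "") for entry in entries if isinstance(entry, dict)]
    # Every author shown on the page should appear in the markup, and vice versa.
    if sorted(visible_authors) != sorted(marked_up):
        problems.append(f"visible authors {visible_authors} differ from markup {marked_up}")
    # The visible date should equal the day of datePublished or dateModified.
    structured_days = {
        str(node[key])[:10] for key in ("datePublished", "dateModified") if node.get(key)
    }
    if visible_date not in structured_days:
        problems.append(f"visible date {visible_date} matches neither structured date")
    return problems


node = {
    "author": {"@type": "Person", "name": "Dana Okafor"},
    "datePublished": "2026-06-20T09:30:00-04:00",
    "dateModified": "2026-06-21T14:05:00-04:00",
}
print(mirror_problems(["Dana Okafor"], "2026-06-21", node))  # → []
```

A header that shows an "Updated" date should match dateModified; one that shows only the original date should match datePublished. This sketch accepts either.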
How aiSiteReady checks your whole site
Hand-checking one template is doable; checking every template, on every release, across the visible layer and the markup beneath it, is not. That's the job aiSiteReady does. It fetches your pages the way an agent would and reports on meta and structured data (JSON-LD, Open Graph), HTML without JavaScript, robots.txt, sitemap.xml, AI-crawler rules, and protocol discovery. The output is a single 0–100 score with blockers and prioritized fixes.
This maps to the discoverability and content-accessibility checks in the score; the exact checks and weights live on the methodology page. And the scanner practices what it preaches. The article you're reading ships a visible byline, a dated header, and BlogPosting plus FAQPage JSON-LD generated from the same honest front-matter. It's all server-rendered, so a crawler that runs no JavaScript still sees it.
Run a free scan to see whether ChatGPT, Claude, Perplexity, and Google AI can read your author, date, and JSON-LD before you publish — in English, Ukrainian, or Russian.
The short version: don't stuff a page with schema for "GEO". Build a transparent page where the reader sees the author and the date, and let honest JSON-LD mirror that reality. For AI search in 2026, that beats any attempt to game a structured-data feature that no longer exists.
IMozz has 20 years in software development, with the past year spent building with LLMs. He builds aiSiteReady, a read-only scanner that checks whether AI agents can read a site. It server-renders its own content as a working example.