Marshall Simmonds shares battle-tested insights from the early days of about.com and the New York Times, to the present-day challenges of attribution, content scraping, and redefining authority in an AI-first world. If you're wondering what still works in SEO, how to future-proof your content, and why "dark traffic" is making a comeback, this episode is your roadmap.
https://page2pod.com - What happens when the gatekeepers of search stop sending traffic, but still take the content? In this milestone Episode 90, Marshall Simmonds (Define Media Group) joins us to trace SEO’s journey from white-text-on-white-backgrounds to log-file analysis, AIO, and LLM.txt, plus what publishers must do right now. He even shares the wild story of spinning up a mirror test server so Google (and others) could pound on About.com to refine their crawlers.
We dig into why watching server logs matters more than ever, how “dark traffic” is back, and why author authority and structured content still win—despite shifting consumption and murky attribution in AI-assisted search.
Marshall also takes us behind the scenes on unlocking The New York Times archive (and why it crushed for traffic and revenue), plus a pragmatic take on AIO monetization, blocking AI training vs. real-time access, and the early but evolving role of LLM.txt.
🧠 In This Episode:
• The About.com “test server” era: letting engines crawl a mirror site to improve their bots, and what Marshall learned by “watching the watchers.”
• Why log files are mission-critical again as dark traffic surges and LLM crawlers multiply.
• Enterprise-grade log tooling & options: Sitebulb, Screaming Frog, Splunk, and what to use when.
• Crawl budget signals hiding in plain sight: the Search Console “Crawl stats” report.
• NYT archive strategy: opening history for scale, membership, and revenue (and why it worked).
• E-E-A-T before the acronym: building human authority into content and authorship.
• LLM.txt today: yes, it’s crawled; no, standards aren’t settled; how teams are exposing it.
• AIO reality check: impressions up, clicks down; what’s converting (and why patience pays).
• Blocking AI training vs. real-time use: robots.txt limits, legal gray zones, and first-line defenses.
• Holistic marketing > channel silos: brand, email, social, search, Discover—working together.
If you've ever wondered where SEO has been and where it's headed, this conversation with Marshall Simmonds is essential listening.
📌 Subscribe for more insightful episodes on SEO, AI, and digital strategy.
💬 Comment below: How are you adapting your SEO strategy in the age of AI and dark traffic?
🔗 Links & Resources Mentioned
• Marshall Simmonds LinkedIn
• Define Media Group Official Site
• Page 2 Podcast Noah Learner episode
What happens when the gatekeepers of search no longer send the traffic, but still take the content? Marshall Simmonds is the founder of Define Media Group and one of the earliest SEO practitioners, having started his career under John Audette, the man often credited with coining the term search engine optimization.
Marshall helped shape how About.com scaled content and how the New York Times unlocked its archive. And now he's watching the rise of AI with a mixture of déjà vu, excitement, and apprehension. This episode traces the arc of SEO from white text on white backgrounds to log file analysis and LLM.txt files.
Think they're not being crawled? Think again. But the real story is about control. Marshall walks us through a pivotal moment when Google wanted access to About.com's content so badly that they let him set up test servers to refine their crawlers. Fast forward to today: traffic is down, attribution is murky, and AI models are scraping content with little recourse for publishers.
We unpack how this shift is forcing a reckoning, not just with [00:01:00] tactics, but with business models and the value of intellectual property itself. For Marshall, the challenge isn't just helping clients adapt to an AI-driven search world. It's convincing them that what worked before (human authority, structured content, holistic marketing) still matters, now more than ever.
If you're new to this industry, you might not know Marshall's name. He's not loud on Twitter. He is not chasing personal brand points, but he is one of the people who helped build the plumbing of search as we know it. When Marshall talks, whether it's about family, SEO, or otherwise, I lean in. You'll wanna turn up the volume for this one.
Here we go.
Welcome to another exciting edition of the Page 2 Podcast, episode 90. We've reached a milestone. As always, I'm joined by my partner in crime at Moving Traffic Media, Joe Devita. And today we have a very special guest, Marshall Simmonds, joining us on the pod. 90 episodes. Can you believe [00:02:00] it? It only took me 89 to get here.
Exactly. You should have been our first guest. I dunno about that. But okay. In preparing for this episode, I went back, I'll back up even further. So before I stepped out to do Fuse SEO full-time, you were one of my first calls just for advice, and I was actually hoping that there was some sort of email thread where I could go back and reference some of the things you said,
'cause I couldn't find my notes from that call. But I'm sure whatever you said was smart and useful, 'cause here we are. But one of the things I found in that email thread was back in 2014, there was a testing group called IMEC. Does this ring a bell? It stood for Internet Marketing Experimental Collective, and it was organized by Rand Fishkin, and Eric Enge eventually took it over.
And he accidentally BCC'd everyone in the group and you were one of them. And so that's how that thread came up in my research. Do you remember any of those?
All things internet. Sometimes it's who you know, not what [00:03:00] you know. It's all about relationships at the end of the day, like you calling on me to ask how you spin off or start your new business, or even what I'm talking to my son about now, who's a rising junior in college.
It's just like, you gotta network and you gotta talk to everybody, because it's not necessarily the conferences where you're gonna learn a new tactic. It's the people that you meet and getting a nugget from them or getting a perspective from them. And that's all I'm looking for too.
It's like, I don't know if there's gonna be one new thing. I think that's a myth, that there's gonna be one thing that's gonna change your traffic. I think it's gonna be something that's gonna change your way of thinking about it. And that's what those groups have always been good for. Some of them now,
I mean, there's great Slack groups out there that have been so critical just to how I'm thinking about something and shifting that way of thinking. That's all I'm looking for. There's never gonna be a real technical tactic that's gonna change the course of our traffic. It's just [00:04:00] gonna be my thoughts around it.
Yeah. We had Noah Learner on early on as well, and he runs the SEO Community Slack group. I think it's one of the largest. Yeah. So yeah. Anyway, waxing nostalgic on that. Let's go back. So you were at the very early stages of what is now SEO, and I was listening to Shelley Walsh's podcast, the history of SEO, and there's just so much in there that I didn't know. Maybe one thing we can talk about is the experience at About.com, where it seems like there's a lot of principles that we think about and have acronyms for today, like E-E-A-T, that you tried to establish there.
And I definitely wanna talk a little bit about that. But one thing that I thought was really interesting was this mirror server that you worked directly with the search engines to set up, to basically fine-tune their algorithms. I feel like this is information that is hidden in the depths of our industry's history.
I'd love to learn a little bit more about that and sort of poke around [00:05:00] there a little bit.
Yeah. First off, I did Shelley's interview for my kids, honestly, because at some point they'll wanna know. They're not interested in this right now. They're not interested in what I do, but that's for them to someday go back and listen to,
'cause I was long-winded in that one, and that was by choice. So they can go back someday and revisit what Dad did, because I went through it in excruciating detail. So I'm not gonna do that here. Finding a place. So the About server was a moment where we had relationships with the engines. I had relationships that we had built up.
When you would go to the Google Dance, Larry and Sergey would come find me and talk to me because, at the end of the day, the search engines wanted directories. They wanted human interaction; they wanted a human-vetted site and webpage experience. They were trying to emulate that, and they still do.
And so they would seek me out 'cause of About.com. We had 80,000 topics. We had over a million pieces of [00:06:00] content. We were one of the largest publishers out there and they wanted access to it. And so in a moment of clarity, I said, what if we set up a test server for you guys? And immediately they jumped on it.
I said, test on it. Do whatever you want. We emulated About.com over here and said, pound on it to your heart's content. And they did. And that was a nice little freeway into some learnings. We did that with Excite. We did that with AltaVista, we did that with MSN. We did that with Infoseek, Direct Hit. We did it with everybody, and with Google, obviously, and just told them, this is for you.
Go test and learn. And so we did at the same time. 'cause we got to watch the watchers.
I was just about to ask like, were they actively sharing what they were finding and interpreting or, I was gonna say, so you've had to extrapolate from what you were seeing in terms of how they're thinking about crawling.
Is that fair?
Log file analysis, which is still alive today. There's nothing more important than watching your log files today, [00:07:00] because there's dark traffic running amok. I'm jumping ahead, but that's exactly it. We have recalibrated dark traffic, and it's exploding again as it was in 2015 and 2018.
Now it's doing it again because attribution is very confusing, and so we are resigned to watching log files to see what's happening from the OpenAIs of the world and all the LLM crawlers that are out there.
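As a rough illustration of the log watching described here, the sketch below counts requests from a few known AI crawler user agents. It assumes access logs in the common "combined" format, where the user agent is the last double-quoted field on each line. The function name and the token list are illustrative assumptions, not something named in the episode, and the list is far from exhaustive.

```python
import re
from collections import Counter

# Illustrative, non-exhaustive substrings that some AI crawlers include in
# their user-agent strings. Treat this list as an assumption to maintain.
AI_CRAWLER_TOKENS = (
    "GPTBot",
    "ChatGPT-User",
    "OAI-SearchBot",
    "ClaudeBot",
    "PerplexityBot",
    "CCBot",
)

# In the combined log format the user agent is the last quoted field.
USER_AGENT_PATTERN = re.compile(r'"([^"]*)"\s*$')


def count_ai_crawler_hits(lines):
    """Return a Counter of requests per AI crawler token found in the logs."""
    counts = Counter()
    for line in lines:
        match = USER_AGENT_PATTERN.search(line)
        if not match:
            continue  # skip lines that do not end with a quoted field
        user_agent = match.group(1).lower()
        for token in AI_CRAWLER_TOKENS:
            if token.lower() in user_agent:
                counts[token] += 1
                break  # count each request once
    return counts


if __name__ == "__main__":
    sample_lines = [
        '203.0.113.5 - - [10/Oct/2025:13:55:36 +0000] "GET /archive/1 HTTP/1.1" '
        '200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.1)"',
        '203.0.113.9 - - [10/Oct/2025:13:55:40 +0000] "GET /news HTTP/1.1" '
        '200 2048 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
    ]
    for token, hits in count_ai_crawler_hits(sample_lines).most_common():
        print(token, hits)
```

User agents can be spoofed, so a count like this is a starting signal rather than proof of who is crawling.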
Yeah. I think let's dig in there a little bit. We use Sitebulb, and we had Patrick Hathaway on the show a few episodes ago.
So cool. Absolutely.
Watching their evolution from local to server-based. Great tool. Yeah.
We use it. It saves us a ton of time for sure. Especially the cloud-based solution. He posted basically a question on LinkedIn around, does it make sense for them to move into the log file analysis tool space?
And there was some healthy debate around how useful that is. From your point of view, clearly there's massive amounts of useful information there. Is there a threshold of a [00:08:00] site size that makes sense to spend time there?
Yeah, a majority of our clients are enterprise blue chip publishers. We would phrase them as super big.
Now, usually that means they have the resources, but not always, because we're talking about hundreds and hundreds of gigabytes a day of information that you would have to parse through. So it's okay to take samples of it as long as it's a representative sample, because you're right.
There are thresholds, and Botify will do it. And that's one of the best features of buying into Botify: it does have that log file analysis bundled in. That's a real benefit. It's an expensive tool, and the UI is very different from a search perspective, but the log file aspect is very useful.
Screaming Frog has a version of it that you can go buy for a year for 200 bucks. That's good as well. That's a working man's tool. A working person's tool is Screaming Frog, and so you have to pull a little bit more. And then there's Splunk. Splunk is like the enterprise Cadillac that [00:09:00] a lot of developers use; you might even already have it onboarded a lot of the time, and that's a great tool as well.
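One hedged way to take a sample from logs that are too large to parse in full is reservoir sampling, which keeps a fixed-size uniform random sample while streaming through the file once. This is a generic technique swapped in for illustration, not a method described in the episode, and the function name and parameters are hypothetical.

```python
import random


def reservoir_sample(lines, k, seed=None):
    """Return up to k lines chosen uniformly at random from an iterable.

    Uses the classic single-pass reservoir algorithm, so the full log
    never has to fit in memory.
    """
    rng = random.Random(seed)
    sample = []
    for index, line in enumerate(lines):
        if index < k:
            # Fill the reservoir with the first k lines.
            sample.append(line)
        else:
            # Replace an existing entry with probability k / (index + 1).
            j = rng.randint(0, index)
            if j < k:
                sample[j] = line
    return sample


if __name__ == "__main__":
    # Stand-in for streaming a very large access log from disk.
    fake_log = (f"request {i}" for i in range(100_000))
    subset = reservoir_sample(fake_log, k=1000, seed=2024)
    print(len(subset))
```

A uniform sample mirrors overall request volume, so quiet sections of a site will be thinly represented. Sampling per day or per section may be a better fit when those matter.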
Yeah. For smaller sites, one of the things we've tried to do is use the sampling in Google Search Console to get us maybe a rough idea of how Google at least is interpreting the site. I know it's not nearly as in depth as a true log file analysis, but do you think there's useful insights within Google Search Console itself, or does it really need to get...
Yeah, you're right. No, most definitely.
Right. It's another signal. It is one signal, and it's a very curated signal of what they want you to see and how they want you to see it, but it's absolutely helpful. I'm not downplaying it at all, and I wish it were more accessible. Some of the APIs have been very useful, and they just opened up a new one last week that's in beta.
I find myself digging through when I'm in Search Console. I love search results. I love the Discover insights. The one that I really like the most, that I dig into [00:10:00] and wish I could get at much more easily, is Crawl stats within Settings. The Crawl stats report is more or less where we watch just to see activity around the
total crawl requests and whether there are any server outages. That's what we're looking at. That's crawl budget right there.
Yeah. Yeah, exactly. So I'd love to circle back to the New York Times. If I remember correctly, it acquired About.com, and that's how you came in to that organization. Take us through what that process was like,
'cause I can imagine, right, old school journalists, ways of doing things not too dissimilar from big organizations today. But how did you take some of that About.com methodology that you had established into the process and start to apply it to something as, I guess at the time, archaic, right?
Yeah, it was rough.
That was old guard meets new guard, and the New York Times as a whole, from the people that I experienced, was not too excited about About.com being part of their portfolio at that point. So it wasn't necessarily easy. There [00:11:00] were three of us that were taken from About.com: one that I hired, Matthew Brown, who's exceptional at what he does now,
and my boss and I were taken from the About team and brought into the NYT. We were basically charged with doing what we did for About for the New York Times, and there were not a lot of believers in the NYT in 2004. In fact, we were told that we were a fad, that the internet was a fad.
Maybe it should have been at this point, I don't know. But the point being is that we had to win hearts and minds, and that was not an easy task. It was rough. The objective there, though, was to find some believers, and we did. We were able to find the movie section; there was a believer there. The head of real estate
was another one, and I think entertainment. There were three people there that were very curious and very interested. And so we were able to get a foothold into the organization. And then as monthly numbers started to come out and you started having all-hands meetings and [00:12:00] sections were being called out for their success, it started to
win hearts and minds, win people over, as success was understood and seen and others within the organization said, how are you doing this? So my job is to make them look good. That's what we all do for our clients, right? Make that point of contact or that organization look good. And so we did, and it was easy because with the New York Times, with a lot of these major brands, like I like to say, you just gotta get out of the way of the brand.
Let it do what it does, whether that's technical or editorial changes that need to happen. This one was the New York Times, and we just needed to get out of the way and let it do what it does, whether that's opening archives up, which is something that we did, or featuring reporters or reporting or topics.
The New York Times has a hubris unlike any other. It drives them. It motivates them, and it is why they are the paper of record.
When you report on the Civil War, you get those accolades. I guess you can say that about yourself.
I've seen the article from [00:13:00] 1854, one of the first newspapers they have at the headquarters there in Times Square, and it's a New York Times
report about a Civil War battle from 1856 or whatever it was. It's pretty impressive. So you're right, there's a historical archive there that is unmatched, and that was one of the first things I looked at. While people weren't paying attention to me, I was looking at other ways to leverage some of the content, this vast archive, and we brought that online.
My boss and I did that. That was interesting. The reason why it's interesting is because last week on Pivot, Scott Galloway, who was on the board, was talking shit about it, and he literally said, what a terrible idea. Which is patently untrue, because from 2006 to 2012, which was the timeline that we kinda looked at, we did like a five-year projection.
We crushed it. We crushed it for traffic, we crushed it for revenue. It brought in new membership. We basically had to prove to them why they should open up these archives, because you're getting revenue from schools and LexisNexis and that's it. That was the low [00:14:00] bar that we had to clear, and we were able to do that with minimal advertisement, some membership and subscriber revenue over the period of five to 10 years.
And we knew the model, and so my boss projected it out. I came up with the business plan around it and we implemented it. You have to have top-down buy-in, and we did, and it's still there today. If you go look at the New York Times sitemap, there it is. And the LLMs were all over that.
So that was getting outta the way of the brand and letting it do what it did. Like you said, Joe, if they've got historical context, let's put the spotlight on it. And at that point in time, it was ad-based clicks and ad-based revenue, and some membership. That was the business plan. That was the model then.
What's wild about that is that in many ways, the foundation of the business model today is all about subscription, especially with declining traffic and whatnot. There's ad revenue, but it seems like the subscription model is the foundation of how some of these companies are gonna survive.
So you were almost ahead of that way [00:15:00] back then as well, without even really being able to see what the future would hold.
TimesSelect. That was TimesSelect in 2011, and there were battles internally. It was great because we were able to formulate a very search-friendly approach to that, because they were gonna wall the garden off and we had to beat that back and say, yes, sure we can, but we need search and social to have access. And we had to win that battle.
And that was hard fought.
One thing I was trying to understand in the timeline, and I'm gonna try to get my dates right: I guess in 2011 you spun out Define Media from the New York Times. How did that, so you were a separate entity inside it?
New York Times.
Okay.
The New York Times always, one thing I credit them for is always, for the most part, always saying yes.
Being very open. And Martin Nisenholtz, who was the guy behind the deal who acquired About.com, he was a visionary. He is a visionary, and I think [00:16:00] he's teaching at Harvard still, and just a great mind. And one thing I said to him: I was ready to spin out on my own in 2004, and when the New York Times came and said, come do for us what you did for About, I said, sure, but I wanna basically be able to consult on the side.
And his rule was, you can consult for anybody but Rupert Murdoch. That's it. That was the rule. Publishing was waking up to the internet in 2004, and we saw an opportunity, and so we started Define Media Group under the NYT and then just worked on the side as well. And then we spun it off in 2011, and we kept the New York Times
as a client till 2018, worked with your wife, John, for many years as well as she took over the reins there too. And they said yes, and it was wonderful because we lifted all boats: learnings that came in from clients were used at the NYT, and the NYT's for the other clients as well. And so our
understanding of how it worked with enterprise publishing was where we came up with that concept of just letting the brand run and getting out of the way.
I always thought that was super [00:17:00] interesting because Joe and I come from Razorfish. That's where we met. And right around the time you were spinning out Define Media is when I was actually starting my first official SEO role, which was interesting, like that same year.
But at Razorfish you couldn't pitch certain things if there were conflicts of interest, or you would have to resign something if you won something else, or you'd have to use the network of agencies within the holding company to cleverly distribute clients that were similar. It doesn't seem to be the case in publishing.
Like they're maybe a little bit more okay with data sharing or having you work on multiple big brands. Is that a fair statement, or were there some, I don't know, political challenges there?
Oh, I think there's certainly some sensitivities that we had to be aware of, and publishers are still awfully competitive. And the one thing that we will never see is for all publishers to band together
and rise up and defend their IP. Now is the best example of that not [00:18:00] happening. Really wish they would. We presented to IAB in 2012 that, hey, Google is now not giving you the overlay click in your images. They're taking your images now and they're keeping 'em in their walled garden. You should do something about this. To no response.
And we've seen that over and over, where we really wish publishers would come together in some type of consortium. That's just not gonna happen. So if there's inherent business practices or sensitivities, like I said before, that's our first rule: we are very aware of that, and aware that what Condé does, they do not necessarily want Hearst to know.
They don't want Axel Springer to be aware of this. So we had to be very aware of those boundaries, and we're very good at it. And that's why I think that we've instilled so much trust in the industry today.
So we talked quickly about this idea of E-E-A-T principles, but it seems like that's something that you've been teaching really since
very early on, like way before we even had an [00:19:00] acronym for it. When you take on a new client, are there still common E-E-A-T challenges on the publishing side that you're seeing?
Yeah. The human side of this is so important, because we had 800 guides at About.com and we certainly saw that was our advantage.
Google was whispering in my ear how important it was to have human-vetted links or content or whatever. That was critical for their understanding and trust. We became very aware of that, and this was in 2001 when we realized that was our linchpin. So we've just carried that forward: biographies of who's writing, what is their pedigree, what is their expertise, what's their authority in a space, and should they be writing about this?
And most notably, these reporters should be and are trustworthy. Google tries to put numbers on that, but in the meantime, our job is to make sure that they have as much recognition and as much exposure as possible, and links into their content, links into their [00:20:00] network, whether that's their social network or professional network or personal network.
That became a priority early on because we knew how Google was looking at it. We knew how MSN was looking at that. We knew because they were telling us how important it was. And look, Yahoo and the Open Directory Project were there for a reason. Wikipedia is there for a reason. And in all those directories, the one underlying factor and the rudder on all that was human power.
Wait, when you start to work with a new client, or maybe you have a recent experience, and you notice that they have all this great content, but they've never established any authority behind authors. They're publishing everything as the company. This is just from the company. There's not one person that the material is associated to.
So you walk into a situation like that, and do you stop and say, we need to develop a slew of experts that you can start associating all new content to? And do you also say, for the thousands of pieces you published in the past, we've gotta figure out how to [00:21:00] assign some kind of author authority to your past articles and squeeze more juice out of them?
It depends, which is, I think, for like 95% of the tasks that we look at, how I have to answer: it depends. It depends on who you are. A lot of the time in these enterprise networks, when you come in and you're working with them, you're gonna say, okay, from this point forward, we're gonna do this, because retroactively, I don't know how important this is, especially with LLMs. At this point, that ship has sailed.
They have gone through the archives, they have looked at all your content, and I don't care if it's been gated or not. There's reason to believe that they have been everywhere and they've looked at and absorbed and hoovered up everything. So from this point forward, we will typically say, let's not work retroactively. One of the things we've learned is resources are always constrained, so let's work efficiently.
But to your point, Joe, if there is a piece of content or a section that we continue to see, through log file analysis or through our diligence, is getting a lot of attention, [00:22:00] gets a lot of traffic, not just search traffic but any kind of traffic, that's where we wanna start.
That's our starting point. That's our epicenter that we'll grow out of. And then from this point forward, we're gonna implement a new strategy that will turn the ship, just nudge it in a different direction, because going back and working through it retroactively, unless there's some real big lift or problem, is usually not the best use of our time and certainly not their resources.
So it depends. It depends on who we're talking about. But you're right, I think that there are sections of the site and there are people that should be highlighted, and there's certainly strategic changes. Now at this point, I feel like for the most part, it's rare you come up against a publisher who doesn't understand authority
or doesn't understand how important that is. And that's what's nice about search at this point: we don't have to prove ourselves to anybody. But that wasn't the case even up to 2018 and 2019. There were still sites that were nonbelievers. I went to battle with BuzzFeed because they just wouldn't come around on it.
They were talking so much shit about SEO, and it was great. It was a good tactic and I get why they were doing it, but at the [00:23:00] same time, we proved how much traffic they were getting from search while they were denying it. We're not there anymore. At this point, I think what we're looking at is 60% of your traffic, 40% of your traffic was coming from search.
That's changing. And what that's forcing a lot of publishers to do is to reevaluate a holistic marketing approach, a holistic marketing strategy. Marketers love to rebrand and rename this, but that's what GEO is. That's all it is at this point. It's like, wait a second, you mean what we should have been doing since 2005?
And that is paying attention to brand, paying attention to email marketing, paying attention to paid, paying attention to social, paying attention to search, paying attention to Discover, to News, et cetera, et cetera. A holistic approach that supports the brand. Yeah, that's important. And all this atrophy of search traffic is doing is making us reexamine
this holistic marketing approach from the 10,000 foot [00:24:00] level.
Another 10,000 foot question, and a little bit of historical perspective. Search engines came to be 25 years ago, and they created this awesome thing for everyday consumers, and they created this great new channel for organic marketing and advertising. And then 10 years later, social media came about and provided this great thing for consumers and a new channel for organic marketing and advertising.
Then OpenAI and ChatGPT as of three years ago. Do you see this as the next evolution, or do you see it just as a consolidation of everything, where we need to now put all these tools together to succeed in this new era? Yeah, I like that.
One thing I can say is we have never seen consumption or adoption change this quickly.
That is absolutely happening. Consumption of media is changing at a more rapid pace in the last two years than we've seen in the last 15 to 20. I'd be surprised if that weren't the case. As we [00:25:00] look back on it, is that evolutionary or revolutionary? It feels revolutionary to me, but it is forcing all of us to take a look at what is the value of our IP and how are we going to either get paid for it, monetize it,
or leverage it. And we're in a holding pattern. There are not a lot of answers right now. And that's been our answer to this question, which is: publishers are very nervous, they're freaking out. And on our side, on the Define side, we've just become counselors and therapists. From a writer: is my job done?
Are we obsolete? And the answer is absolutely no. We need people to report the news. We need perspective. We need opinion, we need context. We need historical feedback. There's so many reasons why. But also, how are publishers gonna monetize this? And the answer is that until Google figures this out, we don't know. We are waiting.
And there's this skunkworks team working on this at Google, and we're days and hours away from [00:26:00] seeing ad injection into AIOs. As soon as we figure this out, this conversation is very different. This time next summer, we're having a much different conversation. Maybe even by your hundredth episode, you're having a much different conversation, because once monetization is figured out, we can finally settle down and get focused on driving traffic in the new way.
So you've been doing this long enough to remember when Yahoo offered basically a pay-for-rank. I can't remember what it was called. PPI or the...
PPI or PII. Something like that.
Sounds good. Yep. What are the chances Google brings that back?
I'd say small to small. Look, there's always gonna be paid advertisements.
And I think that I'm really interested to see what this next iteration looks like. There's lots of rumors swirling around right now. Lots and lots of rumors of how Google's engaging with publishers, and they're throwing some bones here. I think it's likely just to keep the Department of Justice off their back, [00:27:00] what they've been doing to throw to publishers.
Sure, we'll pay you for news coverage and whatnot, and that really hasn't played out, but it looks good on paper. I feel like it could be just as simple as a citation within AIOs that has a nice little ad button next to it. It could be as simple and easy as that. That's my projection or prediction.
I'd really like to know. We're all basically in this holding pattern, this gray area where we don't know what the future looks like. We just know that search is not gonna be the main contributor that it was before.
Oh, I wanted to go back to a comment you made around, what's the value of my IP?
And I find that question fascinating because, to your point earlier, the publishers collectively haven't come together to define what that value is. So, maybe this is a rhetorical question, but how does a publisher look at their IP when there isn't a consistent, I dunno, threshold of what its value is, [00:28:00] right?
Because everyone's treating it differently. Some are like, I'm not gonna give it away because I think it's super high quality. Others are like, yeah, LLMs, take everything you want. And so that diminishes the value. So again, it might be a rhetorical question, but how do you get to a, what's the value of my content?
Does that make sense?
Yeah, it does. And the other problem is that where last year a mid-tier publisher could actually form a partnership with OpenAI, now you cannot. Now these mid-tier publishers are patently getting ignored, not invited to the party at all to have this discussion of like, hey, could you potentially pay us for all the crawling and all
the LLM activity that we're seeing, learning on our content? That ship has sailed too. OpenAI is not doing deals unless you are a big-time publisher now, and you can see this. I think that's safe to say. If you go look, they were keeping this updated pretty regularly in the past, where it was a [00:29:00] promotional PR tool to say like, hey, we just did a deal with Axel Springer.
Hey, we just did a deal with Hearst. Hey, we just did a deal with Meredith. They've gotten quiet about that because they got it, and they don't need the mid-tier publishers for traction as they did before. It puts a lot of publishing in a tough space. Now you see things like TollBit. You see what Cloudflare's doing,
charging for crawling. I'm all for that, because now's the time for publishers, again, to come together. It will not happen. Will not happen. That is an SEO myth: that everybody would just finally stand up to the Googles of the world. It's just not gonna happen.
It seems like they had that opportunity early on and they didn't jump on it, and so it's like that ship has sailed. So now that this LLM scenario has come about, there isn't a collective that can just rise up and
Yeah. And now everybody's operating from a basis of fear [00:30:00] because revenue is dropping so quickly, search traffic is dropping so quickly as consumption changes.
So now, from a place of fear, they'll do anything. And it's not desperate, but it's different. And we haven't seen this. This is fascinating historically, compared with what we've seen, and I've been doing this since mid-'97. This is fascinating to watch how it plays out, which is why I really wish we could get to the next stage, so we could get to work.
'Cause now there's reports like, hey, 78% of all AIOs come from the top 10 results. We're doubling and tripling down on SEO best practices. But meanwhile all it's doing is inflating impressions. We're at the party, but we aren't seeing that traffic convert. If you are like a product review site, we have seen something
shift that's pretty interesting. And that is, if a user clicks through, that journey is much more likely to convert. Where search was usually shooting everybody to the top of the funnel, [00:31:00] now we're seeing some interesting mid-funnel, lower-funnel conversions happening that we've never seen before.
You just have to be more patient, because it's now cut by a third. Where it used to be 66% of the traffic would come in, now it's two-thirds less, but it's converting higher. So yeah, that's an interesting exercise that we're having to go through.
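The trade-off described here, less traffic that converts at a higher rate, can be checked with simple arithmetic. The sketch below is illustrative only: the function names and the sample numbers are assumptions, not figures from the conversation. It computes the conversion rate a site would need to keep total conversions flat after losing part of its traffic.

```python
def conversions(visits, rate):
    """Total conversions for a given number of visits and a conversion rate."""
    return visits * rate


def break_even_conversion_rate(old_rate, traffic_retained):
    """Conversion rate needed to keep conversions flat.

    traffic_retained is the fraction of the original traffic that still
    arrives (for example 1/3 if two-thirds of the traffic is gone).
    """
    if not 0 < traffic_retained <= 1:
        raise ValueError("traffic_retained must be in (0, 1]")
    return old_rate / traffic_retained


if __name__ == "__main__":
    # Hypothetical site: 2% conversion rate, keeps one third of its traffic.
    needed = break_even_conversion_rate(0.02, 1 / 3)
    print(f"Break-even conversion rate: {needed:.2%}")
```

In this hypothetical case, keeping one third of the traffic means the remaining visits have to convert at roughly three times the old rate before total conversions hold steady.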
The journey for e-commerce is gonna be a little bit quicker and easier than for publishing.
Unfortunately. If you get a little less traffic to your e-commerce site, but that traffic converts a little better, maybe you can keep your head above water for a while. But for the publisher, are you blocking an LLM? Seems so silly to block the block.
Are you recommending that? How do you weigh that up?
Yeah, we recommend blocking them from learning, keeping it open for real time. Now, there is a rub to that, because if you read what the Department of Justice [00:32:00] had Google say on the stand, it was: hey, what happens if you restrict access to the LLM?
Are you still using that in real-time search? And Google said, yeah. Are you still using that in your generative search? And Google said, yeah. So look, robots.txt is just a strong suggestion, and we know that for a long time Perplexity was not paying attention to, or was just patently ignoring, any type of directive in robots.txt.
So we've got no recourse here. There is no law that is preventing Google from doing what they're doing. And when pressed, they'll say like, yeah, we're using that. We're using that to train our LLM. That's why we look at the TollBits and the Cloudflares of the world as like the first line of defense.
That's the moat right now.
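A minimal sketch of the "block learning, keep real time open" policy, using Python's standard `urllib.robotparser` to check what a robots.txt file asks of each crawler. The user-agent tokens shown (GPTBot and Google-Extended for training, OAI-SearchBot for search) are the ones those vendors have published, but treat the exact list as an assumption to verify against current vendor documentation, and the example URL is hypothetical. As noted in the conversation, robots.txt is only a request, so this verifies the policy as written, not crawler behavior.

```python
from urllib.robotparser import RobotFileParser

# Example policy: disallow training crawlers, allow everything else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: *
Allow: /
"""


def access_report(robots_text, agents, url):
    """Return {agent: allowed?} for one URL under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    report = access_report(
        ROBOTS_TXT,
        ["GPTBot", "Google-Extended", "OAI-SearchBot", "SomeOtherBot"],
        "https://example.com/news/story",  # hypothetical URL
    )
    for agent, allowed in report.items():
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Checking the file this way only confirms what it requests. Whether a given crawler honors it still has to be confirmed in the server logs.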
Since we're on the robots.txt conversation, do you have an opinion about the LLM.txt file? I'd love to hear it, especially with your review of log files. You probably have a better sense of whether it's actually getting crawled and [00:33:00] used, right?
Oh, it's getting crawled for sure.
It's getting crawled, but what's interesting is that we still haven't come up with a standard, right? I bet you in six to 12 months we've moved on. This is our first shot at it, and we have a couple clients that spun it up pretty quick. Yoast has it now. You just click the button and boom, it's there.
I read an interesting article yesterday about, okay, now you have it, how do you promote it? There was an interesting article from archeredu.com that talks about LLM.txt files being implemented across the web, and they saw the same thing, which is like, put it up.
We saw it just get grabbed, but how do you promote it? They came up with a real smart way of basically putting it in a rel alternate tag in the header section of the site. Real smart way to do that, 'cause we know that the [00:34:00] search engines will see that, the crawlers will see that.
And then there's directories that you can submit this to. Point being, directory.llmstxt.cloud is a directory to submit to, because we need to make sure that the crawlers look for it. And how do they know if they don't know it exists? And so there's a lot of ways that we're looking at
to promote that file, but you gotta have the tracking set up beforehand, and that gets back to making sure that before you click the button, you have a way to watch what's happening. If that seems convoluted, it's because it is, and we're all at this wild west stage where we're all trying to figure this out.
It's fascinating. It's really exciting. It's a very exciting time and we're all back to learning. So if you were bored with SEO, wake up, it's exciting again.
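The advice to have tracking in place before publishing the file can be sketched as a small log-file check. This is a minimal example under stated assumptions: the server writes the common "combined" access log format, and the file lives at `/llms.txt` (a commonly proposed location, not a settled standard). The sample log lines and bot names are invented for illustration.

```python
import re
from collections import Counter

# Combined log format: ip ident user [time] "METHOD path proto" status bytes "referrer" "agent"
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def count_hits_by_agent(log_lines, target_path="/llms.txt"):
    """Count requests for target_path, grouped by user-agent string."""
    hits = Counter()
    for line in log_lines:
        match = LOG_PATTERN.match(line.strip())
        if match is None:
            continue  # skip lines that are not in the expected format
        path = match.group("path").split("?", 1)[0]  # ignore query strings
        if path == target_path:
            hits[match.group("agent")] += 1
    return hits


if __name__ == "__main__":
    sample = [
        '203.0.113.5 - - [10/Oct/2025:13:55:36 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
        '203.0.113.9 - - [10/Oct/2025:13:56:01 +0000] "GET /llms.txt?v=2 HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
        '198.51.100.7 - - [10/Oct/2025:13:57:12 +0000] "GET /about HTTP/1.1" 200 2048 "-" "OtherBot/2.3"',
    ]
    for agent, count in count_hits_by_agent(sample).most_common():
        print(agent, count)
```

Running a check like this before and after the file goes live gives a baseline, so any change in crawler activity can be attributed to the file rather than guessed at.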
Joe and I have had a lot of conversations about that, which is that the industry is fun again. I think a lot of us got into it because you're constantly learning and every website has its own unique properties, so you come across new things every [00:35:00] day, and that excitement is back. It's been a few years where it was just humdrum, doing a lot of the same things, and generally there seems to be a lot more fun back in the industry
because of all these things.
What's not fun is the traffic. That's not. But everybody that we're working with is willing to ride this wave a bit, 'cause one, we don't have a choice. But number two, one thing that we instill quickly with any engagement is you have to be willing to test and learn. That's more important now than ever, and that can mean spinning up an LLM.txt file.
That can mean going in a direction of content that you may not have considered before. Being okay with traffic decreases for the sake of gains in impressions. We are very much in test and learn, and in that holding pattern that I was talking about, now.
it seems like there are a handful of no harm done optimizations that you can start testing immediately.
Robots, [00:36:00] LLM.txt, structuring your content a little bit differently. The other thing, and I don't know if it's exciting or scary now, is that there's no white hat, black hat. There's no one telling us: this is wrong, you can't do that, we're gonna put you in the sandbox if we catch you doing this.
There's no one saying that now. So it's all about experimentation until you figure something out.
Yeah. It's funny that you mention the white hat, black hat, because I've been reading about how white text on white background is back again. You're telling the LLM, like, this is the definitive source no matter what else you read. Who knows if that works?
The tools out there will correct for that pretty quick. But you're right, there's a lot of different protocols that we can all learn. I saw a job listing the other day for SEO and AI optimization. That's it, right? And that's the new exciting world: to understand what it means for Model [00:37:00] Context Protocol.
What is Markdown? What is A2A? The race is so early that if you're moving into this industry, there it is. There's your opportunity. You can put SEO in your title, I think that's still important, but there are other things right now that you can find a foothold in.
That is a career path for sure.
I was gonna say, I think you'll still need the foundation of SEO to be successful optimizing for AI. I think markup language is something we talk about a lot. Are there any other things that you think will just stand the test of time with SEO best practice?
The unknown. The unknown is Google, because Google loves to put out their own standards. We got that with the sitemaps protocol. No matter what else is going on in the industry, Google's like, no, our way, this is how we are doing it internally, and this is how the world's gonna do it. Now we have to hold tight.
We can't put all our eggs in one basket. [00:38:00] I know that's not really your question, but at the end of the day, how you structure content on the page, I think, is really important. To answer your question, that's never gonna change, and that has been cyclical from the very beginning. And we can all scream about title tags and how they're important.
Title tags and headlines are still important. How you engage your audience is really important. That's what Google's trying to emulate. The standards are gonna change. The race is still early on. The winning technologies are still TBD. But at the root is how do you structure your site for easy crawl access, easy ingestion of content, and then understanding the context of the article itself and who wrote it.
I think the one thing that's definitely consistent is the technical side of things. 'Cause even for LLMs, if they can't access and crawl the content, you're still at a disadvantage. So it seems like, at a minimum, there's a level of technical, traditional SEO required just to have that content discovered.
You mentioned GEO earlier, and I'm definitely not interested in sending [00:39:00] you down the debate of what the new acronym is. But again, in doing research for the podcast, you were employed by John Audette, who takes credit for the SEO acronym. Whether that's true or not, I don't know. But when you go into a pitch or when you're talking with your clients about SEO and AI, are you using a different acronym or are you still referencing it?
No. Okay.
Yeah, I try to dispel all that noise. I don't give a shit about the new acronym. I really don't. But what you're saying is important, because ultimately that noise, and like you said before, the technology changes, that, my friends, is job security right there. Because what clients are looking for is somebody to help them understand and translate the noise.
There's a lot of distraction happening right now, and that is opportunity for us to give some signal from that noise. And that, I think, is all of our [00:40:00] opportunity: to help translate and distill down a message. And a lot of it is that nothing has changed right now. It really hasn't.
From an SEO practitioner standpoint, the world is absolutely shifting below our feet and around us, et cetera. And consumption is changing, no doubt. But what we need to be focusing on right now is this and this, and it's technical and it's editorial, and it's managing the website and it's tracking.
That has not changed. The noise around us has, and everybody's screaming like, what are we doing about AI? What are we doing about AIO? What are we doing to make sure that we're future-proof? Job security. And bless Google's little wicked heart for doing that and changing the direction every six to 12 months, because it has kept us employed and put roofs over our heads.
And it will continue to do that. And there will be atrophy. There will be big changes. Absolutely. For sure. And marketing budgets are gonna shift. They have to. They have to change from where everybody was all in on search. It's not gonna be that way. That's what we have to adapt to quickly. I think that would be [00:41:00] my advice to others: realize that you're not gonna be the most important person at the party anymore.
I know we're getting close to wrapping up, but I was at the MozCon where you presented this concept of dark traffic, and you've been preaching that really ever since, and probably before that presentation even. It's certainly grown with the lack of referral data from LLMs and not getting referral data when someone clicks out of an AI Mode experience, all those sorts of things.
Are you, how are you trying to close that gap and present that out to clients today given that it's even more complex than it was just with social? Have you figured out a way to represent that?
Yeah, it's perspective right now, because for search, 25 basis points, I think, is all the traffic that's coming from OpenAI.
That's not a lot. That's 0.25% of traffic, versus the rest of it coming from Google. And so the perspective is we've got time to prepare. But more importantly, I think you can track OpenAI referrers, and that's [00:42:00] important, and there's a lot of tools that will do it. Talk about the Wild West.
We are seeing this next wave of tools upon us, and we are being flooded with that right now. And that's cool, right? I'm going through another Profound demo in a couple hours to see how it's gonna work for this client. And I think we need to get reporting in place. We've got time. We can track with log file analysis; that hasn't changed.
And so we can prepare. And that's what we can be doing in this holding pattern right now: how are we going to track attribution? Are we tracking activity? That's really all there is to be doing right now from a reporting perspective, and then going through all the demos of what's right for you. Is it TollBit?
Is it Gumshoe? Is it Profound? Is it Adobe's LLM Optimizer that's coming out, and that's white glove, Cadillac? AthenaHQ? There's so many different tools that we could be evaluating right now. We should be, to see what works. Preparation.
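Before evaluating vendor tools, referrer attribution can be roughed out from data most sites already have. The sketch below is an illustration only: the host-to-source mapping is an assumption that would need to be maintained by hand, and requests with no referrer are labeled "dark/direct" as shorthand for the dark traffic discussed earlier. It also shows the basis-point arithmetic: 25 basis points is 0.25%.

```python
from collections import Counter
from urllib.parse import urlparse

# Assumed mapping of referrer hosts to sources; extend as new referrers appear.
AI_REFERRER_HOSTS = {
    "chatgpt.com": "OpenAI",
    "chat.openai.com": "OpenAI",
    "perplexity.ai": "Perplexity",
}


def classify_referrer(referrer):
    """Map a raw referrer URL to a coarse traffic source label."""
    if not referrer or referrer == "-":
        return "dark/direct"
    host = (urlparse(referrer).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host in AI_REFERRER_HOSTS:
        return AI_REFERRER_HOSTS[host]
    if host == "google.com" or host.endswith(".google.com"):
        return "Google"
    return "other"


def referrer_shares(referrers):
    """Return each source's share of visits as a percentage."""
    counts = Counter(classify_referrer(r) for r in referrers)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {source: 100.0 * n / total for source, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical sample: 1 OpenAI visit in 400 is 0.25%, i.e. 25 basis points.
    visits = ["https://chatgpt.com/"] + ["https://www.google.com/search"] * 399
    for source, pct in referrer_shares(visits).items():
        print(f"{source}: {pct:.2f}% ({pct * 100:.0f} bps)")
```

A count like this only covers visits that still carry a referrer, which is why it pairs with log file analysis rather than replacing it.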
That might be a quick segue [00:43:00] to this: you're doing some venture activities now, which I guess is your newest venture.
How do you apply your SEO knowledge to evaluating? I don't know if any of the tools that you mentioned are based in Boise or not, but how do you apply that background to evaluating a company? Or is it more that you're using just your experience building a business to evaluate these companies?
Is there any parallels there?
Yeah. Who hasn't sat still is private equity and venture capital, which have been extraordinarily active, and we don't see it in the public. You hear about it at the local level. Like, oh, all of our plumbers are now owned by one consolidated company, or my HOA, or roofers, or, I don't know, our smoothie stores are all owned by one.
This is happening. We've been working in that space, in the diligence space, since pre-COVID, and working with some other [00:44:00] houses that come to us and say, hey, we either need to check the box to make sure that there's no red flags, or you need to help us evaluate the cost of acquisition, the CAC variable.
And that number is very important: how much does it cost to get a new customer? That's the metric that they care about, and that's what we're all working towards. Doesn't matter where, as long as you get to that CAC number eventually, and looking at and projecting that out based on changes in search, changes in behavior, changes in how Google is presenting data now, which is very important too.
Are traffic losses affecting CAC? Are they gonna have to go buy more traffic and use you guys to go buy paid media? Because that bites into cost of acquisition. That's what VCs and private equity are interested in, and that's the number that rules all. There's a lot of nuance to that, but for the most part, that's what we get involved in.
Real interesting. Real fast moving, and it's happening now. Money was a lot cheaper coming outta [00:45:00] COVID, and so there was a feeding frenzy, and that has slowed down a little bit, but not much.
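The CAC question raised here, whether lost organic traffic pushes up the cost of acquiring a customer, can be sketched with basic arithmetic. Everything below is a simplified illustration: the function names, the assumption that every lost organic customer is replaced through paid media at a fixed cost, and the sample figures are all hypothetical, not a description of how any diligence model actually works.

```python
def customer_acquisition_cost(total_spend, new_customers):
    """CAC = total acquisition spend divided by new customers acquired."""
    if new_customers <= 0:
        raise ValueError("new_customers must be positive")
    return total_spend / new_customers


def projected_cac(baseline_spend, organic_customers, paid_customers,
                  organic_loss_fraction, cost_per_paid_customer):
    """Project CAC if lost organic customers are replaced with paid media.

    Assumes (for illustration) that total customers are held constant by
    buying one paid customer for every organic customer lost.
    """
    lost_customers = organic_customers * organic_loss_fraction
    extra_spend = lost_customers * cost_per_paid_customer
    total_customers = organic_customers + paid_customers
    return customer_acquisition_cost(baseline_spend + extra_spend, total_customers)


if __name__ == "__main__":
    # Hypothetical business: $10,000 spend, 80 organic + 20 paid customers.
    before = customer_acquisition_cost(10_000, 100)
    after = projected_cac(10_000, 80, 20, organic_loss_fraction=0.25,
                          cost_per_paid_customer=150)
    print(f"CAC before: ${before:.2f}, projected CAC: ${after:.2f}")
```

Under these made-up numbers, losing a quarter of organic customers and buying them back raises CAC from $100 to $130, which is the kind of shift an acquirer would want projected.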
You talk a lot about knowing your audience. I've read a bunch of your articles on how important you believe this is.
You have to know your audience, you've gotta know your audience. Do you ever take on projects just to help companies understand their audience? I ask because we have a lot of conversations, discovery calls, with new clients who say, our customer is a mom and she's got an interest in tennis and blah, blah, blah.
And then when you start to dig into the conversion data, you gotta have a really soft conversation about who the audience actually is. So, long-winded question: do you ever do these audience definition projects for people? Seems like you're really good at it. It starts, it always starts
there.
I wish we could take longer to really refine and define who that person is, but ultimately it gets concentrated quickly into, like, tell us what to write, and tell us how to reach that person. Because you're absolutely right, Joe. [00:46:00] If there is a big course correction to be had, it needs to happen.
And sometimes, you're right, it needs to happen softly, gently. Sometimes very abruptly, when you're working, for example, with a new news publisher that's coming on the scene, and there have been some interesting new players out there. You have to let them write about what they wanna write about 80% of the time, but 20% of the time, they better be looking for a foothold of authority.
Whether that is, hey, we're gonna cover this sports beat, maybe it's professional lacrosse. You've gotta find an angle quickly to establish trust, to get a foothold, to then expand out from there. That hub and spoke, that certainly hasn't changed. They call it content fanning. Okay, again, new terminology.
Same concept. Of course, there's nuance to it, and it's a little bit different in how the LLMs interact with it. Same concept, all coming down to keyword research, understanding your audience, writing for that audience, and getting it out there quickly and accurately. Same as it ever was, if that dates me. But I love that, because this industry is cyclical, and we all have historical context for how to [00:47:00] apply this to the new best practices.
And a lot of it hasn't changed. And refining that signal is what our job is.
One last question before we wrap up. So you mentioned the fan out. I know you've been building internal tools at Define well before all the expansion around some of the tools that we mentioned earlier.
Are you building anything to help track and, I don't know, report on Query fan out? Because that's another new buzzword. How are you guys thinking about that today?
So for that we are watching the News Top Stories module, and we're watching Discover, and that is another signal that we can pull in, whether it's impression data or when you start to get real traffic in Discover.
The problem is it's the blackest of the black boxes. It's so opaque. But the data that comes in from Search Console, if you can categorize that, there's some interesting themes to be discovered there, and so we are absolutely digging into that, where we wanna see topics, subtopics, and entities that come [00:48:00] from that Discover data. Being able to analyze that, we can then feed that back into the machine to say, okay, we know that's positive and that we've reached our audience.
Now we understand what our audience wants and what Google establishes us as an authority for, and that category, subcategory information is gold.
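One simple way to start categorizing exported Discover or Search Console rows is to group them by site section. The sketch below is a rough illustration and not the internal tooling described in the conversation: it assumes each row is a dict with hypothetical `page`, `clicks`, and `impressions` keys, and it uses the first URL path segment as a stand-in for a category.

```python
from collections import defaultdict
from urllib.parse import urlparse


def category_from_url(url):
    """Use the first path segment as a crude category label."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    return segments[0] if segments else "(root)"


def summarize_by_category(rows):
    """Sum clicks and impressions per category across exported rows."""
    totals = defaultdict(lambda: {"clicks": 0, "impressions": 0})
    for row in rows:
        category = category_from_url(row["page"])
        totals[category]["clicks"] += row["clicks"]
        totals[category]["impressions"] += row["impressions"]
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical export rows.
    rows = [
        {"page": "https://example.com/sports/lacrosse-final", "clicks": 40, "impressions": 900},
        {"page": "https://example.com/sports/draft-preview", "clicks": 10, "impressions": 300},
        {"page": "https://example.com/politics/budget-vote", "clicks": 5, "impressions": 700},
    ]
    for category, stats in summarize_by_category(rows).items():
        print(category, stats)
```

A real pipeline would likely map pages to topics and entities rather than URL folders, but even a folder-level rollup shows which sections are being surfaced.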
We usually end by asking a prediction question, but you gave us your prediction of what the ad experience is gonna look like in the future. So maybe we can wrap up with maybe three or four
Like lightning round questions, just because I had so many questions I wanted to ask. First one, were you ever a DMAs moderator, or I guess you were. I knew it. I knew it
kept putting the, I kept putting the cool badge on MMG, which was our, the company I worked for at, in 1997, and it kept getting taken.
Which is the little pong tree. But yes, I was DM for that and I was a dmma for the band Phish back in the day.
Incredible. Is there an SEO myth you wish would finally die?
Yes. Fuck yes. It's like, SEO's dead. Are we done with that? Are we done? Because guess [00:49:00] what, they still need us, right? They still need us.
And the other, contrarian one, and this is the myth that won't die, is: hey, all the publishers are gonna come together and rise up. That's my contrarian view. The other one, more positive, is the myth that SEO is dead.
Uh, and then last one, I know a lot of, you've worked a lot on Google penalties and algorithm update recoveries and things like that.
Is there one that sticks out as the most difficult, and can you talk a little bit about how maybe you reversed that? This is probably not a lightning round.
The product review updates over the last two years coming outta COVID, and the emphasis on Your Money or Your Life. Wow, that was rough sailing, because every publisher was in it, whether they were doing a crappy job of just putting up a bunch of listicles, like, hey, the 50 best things for Father's Day, with no context around it, to what is now a very robust category.
That was rough, and understanding the product review cycle and what happened there was fun, but [00:50:00] really stressful. That's what kind of gave rise to E-E-A-T coming outta COVID as well: if you were gonna have some type of product review, you better be testing, you better take into account who you are.
Do you have legitimacy in this space? And feeding that back into some of the networks that we were a part of, that wasn't fun. And that gave rise to the coupons update that some of publishing experienced last May of '24. That was rough. And that was basically a tough conversation where you just had to pull up stakes, because it was not gonna be a promotional opportunity anymore.
That was in publishing as well. Product review was tough, but what it did do was refocus on authority and trust. I've
got one more, one more, John, for the lightning round. 20 years ago, you could call Sergey Brin when you had a question. He trusted you, you trusted him. 10 years ago, maybe it was Matt Cutts.
You could solve his riddles, and [00:51:00] some of the information that he was sharing with the world, you could understand it. Is there someone now from Google whose every word you really pay attention to?
I pay more and more attention to our peers, to people like you, people in the Slack channels, the people that are writing and testing.
Not necessarily those out there just blasting out information on Bluesky or Twitter, but those that are really doing research, because Google is locked down. That information cycle is locked down, and I get more out of these kinds of conversations and conversations that are happening in private communities. That's more important.
Marshall, this has been incredible. I know your time's valuable, so I definitely don't take your willingness to sit down with us for granted at all. I know you're speaking at the NESS conference in October. Tell the audience where else they might be able to find you, social or otherwise.
I'm pretty quiet, John.
That's a bit of a time suck, and our time is better focused elsewhere. I'll come back for your hundredth episode when you [00:52:00] have a panel of people talking about the new changes in search. And congrats on 90. That's a lot of work, man. Good for you guys. I would say every once in a while I'll come outta my cave.
Shehata got me to come out for NESS; I'll talk there. But other than that, I'm more of a consumer than a poster on the main channels, 'cause there's not much to be had for me. I have since moved on from those platforms. Always willing to have conversations on the side, and always willing to, I dunno, have strategy sessions and spitball ideas.
We do. I do that quite a bit with people and that's fun. That's where I learn a lot or we learn about new potential tests.
Marshall Simmonds, everyone. Thanks for joining us for another episode of the Page 2 Podcast, and if you enjoyed the show, please remember to subscribe, rate, and review. We'll see you next time.