UPROER TECH SEO COHORT WEEK TWO – What Tech SEO Really Is

Episode Summary.

The second episode from our technical SEO cohort is officially live!

 

In this episode, Carmen Rodríguez explains the importance of technical SEO, focusing on making sure search engines can find and read your pages easily. She highlights how internal links help organize your site and how anchor text (the clickable words in a link) gives context.

 

Carmen also discusses that too many redirects and 404 errors (broken pages) can confuse search engines, so it’s best to fix them. She covers the use of robots.txt and sitemaps to guide Google’s crawler and explains the importance of canonical tags and meta robots to avoid duplicate content.

 

FCDC Cohort Sponsor.

A huge thank you to Uproer for sponsoring our 5th Technical SEO cohort and making these trainings accessible to BIPOC folks in developing countries!

 

Uproer is a search marketing agency that provides search engine optimization (SEO) and search engine marketing (SEM) services to ecommerce and SaaS companies.

Uproer’s services include: 

  • Search strategy: Uproer helps clients develop outcome-driven search strategies to grow their pipeline or drive revenue 
  • Ad campaign management: Uproer manages ad campaigns for clients 
  • Resource planning: Uproer helps clients plan their resources 
  • Tactical roadmaps: Uproer helps clients create tactical roadmaps that prioritize SEM opportunities 

 

Elevate your online presence and reach your growth potential with Uproer.

 

Teacher’s Profile.

 

✍🏾Name: Carmen Rodríguez Domínguez

✍🏾What Carmen Does:

✍🏾 Company: 

✍🏾Noteworthy:  Carmen has over 7 years of experience in digital marketing, specializing in content, translations, and digital PR before focusing on SEO.

 

 

Connect with Carmen.

Resources.

 

Key Insights.

🗝️ Understanding Crawlability and Indexability

Carmen explains that crawlability refers to the ability of Google’s bots to access and navigate through the pages of a website. This involves the crawler’s ability to spend time on a site’s pages, jumping from one to another to gather information.

 

Indexability, on the other hand, determines whether a page can be stored and displayed in search engine results for user engagement. Both are crucial for SEO success, as pages that cannot be crawled or indexed will not appear in search results, regardless of their content quality. Carmen also highlights the page rank algorithm, a core part of Google’s algorithm that plays a significant role in determining a page’s importance.
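To make the distinction concrete, here is a rough Python sketch (not code from the session; example.com and the blog path are placeholders): robots.txt answers the crawlability question, while the HTTP response and its X-Robots-Tag header answer part of the indexability question (a meta robots tag in the HTML can still block indexing).

# Rough illustration, standard library only; example.com is a placeholder.
from urllib import robotparser, request

SITE = "https://www.example.com"
PAGE = SITE + "/blog/some-article/"

# Crawlability: does robots.txt allow Googlebot to fetch this URL?
rp = robotparser.RobotFileParser(SITE + "/robots.txt")
rp.read()
print("crawlable:", rp.can_fetch("Googlebot", PAGE))

# Indexability: the page should return 200 and not send a noindex header.
resp = request.urlopen(PAGE)
x_robots = resp.headers.get("X-Robots-Tag", "")
indexable = resp.status == 200 and "noindex" not in x_robots.lower()
print("indexable:", indexable, "(meta robots in the HTML can still block it)")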

 

🗝️ Page Rank Algorithm and Its Impact

 

The page rank algorithm is a core component of Google’s search algorithm, used to identify the importance of a website. Carmen explains that this algorithm evaluates a website based on its site structure, internal linking, backlinking, and user engagement (such as clicks).

 

Pages with more internal and external links are considered more important. User engagement, particularly clicks, also affects page rank. Carmen provides examples to illustrate how different levels of internal and external linking can result in high or low page rank, emphasizing the need for a well-structured site with strategic linking.
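As a rough illustration of the idea (a toy version of the published PageRank formula, not Google’s production system), the snippet below scores a four-page site purely from its internal links; the page names and link graph are hypothetical.

# Toy PageRank power iteration over a made-up four-page site.
links = {
    "home":     ["products", "blog", "contact"],
    "products": ["home"],
    "blog":     ["home", "products"],
    "contact":  ["home"],
}

damping = 0.85
pages = list(links)
rank = {p: 1 / len(pages) for p in pages}

for _ in range(50):  # iterate until the scores settle
    new_rank = {}
    for p in pages:
        incoming = sum(rank[q] / len(links[q]) for q in pages if p in links[q])
        new_rank[p] = (1 - damping) / len(pages) + damping * incoming
    rank = new_rank

# Pages that many others link to ("home", then "products") end up scoring highest.
for page, score in sorted(rank.items(), key=lambda kv: -kv[1]):
    print(f"{page:<9} {score:.3f}")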

🗝️ Internal Linking and Its Role in SEO

Internal linking is crucial for establishing the hierarchy and importance of pages within a website. Carmen discusses how internal links help Google understand the structure and significance of a site’s content. She explains that breadcrumbs can improve internal linking and enhance user experience by making it easier for users and crawlers to navigate the site.

 

The significance of anchor text is also discussed, as it provides context to Google about the linked page’s content. Carmen advises against using non-descriptive anchor text like “Read More” or “Click Here,” which do not convey meaningful information to search engines.
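A minimal sketch of the kind of anchor-text audit this implies, using only Python’s standard library; the HTML snippet and the list of "generic" anchors are illustrative assumptions, not part of the lesson.

# Pull anchor text out of HTML and flag generic or empty anchors.
from html.parser import HTMLParser

GENERIC = {"read more", "click here", "here", "more information", "learn more"}

class AnchorAudit(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_link = False
        self.href = None
        self.text = []
        self.findings = []  # (href, anchor_text, is_generic)

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.in_link = True
            self.href = dict(attrs).get("href", "")
            self.text = []

    def handle_data(self, data):
        if self.in_link:
            self.text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.in_link:
            anchor = " ".join("".join(self.text).split())
            self.findings.append((self.href, anchor, anchor.lower() in GENERIC or not anchor))
            self.in_link = False

html = '<a href="/seo-guide/">Read more</a> <a href="/seo-guide/">Technical SEO guide</a>'
audit = AnchorAudit()
audit.feed(html)
for href, anchor, generic in audit.findings:
    flag = "FIX: generic or empty anchor" if generic else "ok"
    print(f"{href:<15} {anchor!r:<25} {flag}")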

 

🗝️ Redirects and Their Impact on SEO

Carmen explains that redirects can significantly impact crawlability and indexability. She advises against using multiple redirects, as they can waste crawl budget and create redirect loops, which confuse crawlers and hinder page indexing. Instead, she explains the importance of using single redirects to maintain a clear and efficient path for crawlers.

 

Carmen also highlights the need to fix 404 errors, as they can negatively affect page quality and user experience. Ensuring that all redirects are properly managed helps maintain the integrity and accessibility of a website’s content.
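A hedged sketch of such a redirect check is shown below; it assumes the third-party requests library is installed, and the URLs are placeholders rather than real pages.

# Follow each redirect hop and flag long chains or endpoints that return 404.
import requests

def check_redirects(url: str) -> None:
    resp = requests.get(url, allow_redirects=True, timeout=10)
    hops = [r.url for r in resp.history]  # each intermediate redirect response
    if len(hops) > 1:
        print(f"{url}: {len(hops)} hops, point links straight at {resp.url}")
    elif len(hops) == 1:
        print(f"{url}: one redirect to {resp.url} (acceptable)")
    else:
        print(f"{url}: no redirects")
    if resp.status_code == 404:
        print(f"{url}: final URL is a 404, fix or remove internal links to it")

for link in ["https://www.example.com/old-product/",
             "https://www.example.com/broken-page/"]:
    check_redirects(link)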

🗝️ Sitemaps and Robots.txt Files

 

Sitemaps and robots.txt files are essential tools for controlling crawlability. Carmen explains that a well-structured site map guides Google’s crawler to the most important pages, ensuring they are crawled and indexed efficiently.

 

The robots.txt file can be used to allow or disallow crawlers from accessing specific parts of a website. Carmen provides examples of how to use robots.txt to block certain pages or direct crawlers to prioritize important content. Proper configuration of these files helps Google use its crawl budget effectively and prioritize the most critical pages of a site.
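As an illustrative sketch (not a template from the session), the snippet below writes a bare-bones sitemap.xml for a few placeholder pages and a robots.txt that declares it while keeping crawlers away from thin search and checkout URLs.

# Generate a minimal sitemap.xml and a robots.txt that points to it.
import xml.etree.ElementTree as ET

IMPORTANT_PAGES = [
    "https://www.example.com/",
    "https://www.example.com/collections/trainers/",
    "https://www.example.com/blog/technical-seo-basics/",
]

urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
for page in IMPORTANT_PAGES:
    url = ET.SubElement(urlset, "url")
    ET.SubElement(url, "loc").text = page

ET.ElementTree(urlset).write("sitemap.xml", encoding="utf-8", xml_declaration=True)

# robots.txt declares the sitemap and keeps the crawler out of thin pages.
robots_txt = """\
User-agent: *
Disallow: /search
Disallow: /checkout
Sitemap: https://www.example.com/sitemap.xml
"""
with open("robots.txt", "w") as fh:
    fh.write(robots_txt)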

 

🗝️ Canonical Tags and Redirects

 

Canonical tags are used to indicate the primary version of a page when multiple similar pages exist. Carmen explains that canonical tags help prevent duplicate content issues by signaling to Google which page should be considered the main one. She discusses common issues with canonical tags, such as Google ignoring them if it determines another page is more relevant.

 

Redirects are also important for directing users and search engines to updated or relevant content. Carmen advises on when to use redirects and the considerations involved, such as ensuring that redirects are properly managed to avoid issues like redirect loops.
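A small standard-library sketch of the canonical check described above; the HTML and URLs are stand-ins. Because a canonical is a signal rather than a directive, Google may still pick a different page, so this only verifies what the page declares.

# Read the <link rel="canonical"> from a page and compare it to the URL you expected.
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    canonical = None
    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and a.get("rel", "").lower() == "canonical":
            self.canonical = a.get("href")

html = """<html><head>
<link rel="canonical" href="https://www.example.com/trainers/">
</head><body>Blue trainers product page</body></html>"""

expected = "https://www.example.com/trainers/blue/"
finder = CanonicalFinder()
finder.feed(html)

if finder.canonical is None:
    print("No canonical tag found")
elif finder.canonical != expected:
    print(f"Canonicalised to {finder.canonical} (this page points at another URL as the main version)")
else:
    print("Page declares itself as the canonical version")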

 

🗝️ Meta Robots and NoIndex Directives

 

Meta robots tags are used to instruct Google on how to handle specific pages. Carmen explains the difference between noindex and nofollow directives. Noindex prevents a page from being indexed, while nofollow prevents search engines from following links on a page.

 

Carmen provides examples of how to use meta robots tags to prevent pages from being indexed, such as pages that are not intended for public view or are duplicates. She emphasizes the importance of regularly checking meta robots tags to ensure they are correctly implemented and not inadvertently blocking important content from being indexed.
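A brief sketch of how such a check might read the meta robots tag and separate the two directives; the HTML snippet is illustrative only.

# Parse the meta robots tag and report noindex / nofollow separately.
from html.parser import HTMLParser

class MetaRobotsCheck(HTMLParser):
    directives = set()
    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and a.get("name", "").lower() == "robots":
            self.directives = {d.strip().lower() for d in a.get("content", "").split(",")}

html = '<head><meta name="robots" content="noindex, follow"></head>'
check = MetaRobotsCheck()
check.feed(html)

print("blocked from the index:", "noindex" in check.directives)   # True
print("links not followed:   ", "nofollow" in check.directives)   # False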

 

Episode Transcriptions.

carmen domínguez rodríguez 0:17

Okay, I’m going to start already, and then, if not, he can watch the video. So last time, we went through how Google works and how Google understands content in general, links, a top-level view of how a crawler gets to your site and tries to understand everything. Today, we’re going to be doing a bit of a deeper dive into the parts that concern technical SEO the most and the usual issues that we find when we are doing technical audits.

 

We’re also going to be particularly focusing on indexability and crawlability, because those are the areas that cause the most issues, although, as I told you last time, the most difficult area to identify when there is an issue is rendering. So also, like last time, feel free to ask questions or ask me to explain anything, because this part is rather complex, and probably it will take you a few times to go through it to understand many of the things; it took me forever.

 

So don’t feel bad if you don’t. We’re also going to go through Google Search Console, which everyone should have access to, and the template I mentioned in the task. We’re also going to redo the task after the next class using one or two tools, Sitebulb and Screaming Frog, which you should already have access to. If not, please make sure that you do for next time. But today, we’re only going to be focusing on Google Search Console.

 

So we already said last time that the main goal of technical SEO is to make sure that your site and content is visible and easy for Google to understand. Because if Google cannot find the content and understand which parts of the content are important, Google is never going to show your content on top of your competitors.

 

So if your technical setup is not correct, your pages are not showing the right relevance and importance to Google, or it is too confusing, Google will actually discard them. And as I said last week, we talked about the index and how Google was de-indexing pages.

 

Now there have been a few SEOs that have done some tests, and it’s proved that if, after 123 days, Google has not been able to identify your page, or no user has gone to your page, they will de-index it. So something that was a suspicion over the last eight months is now confirmed to be happening.

 

So that’s why making sure that the crawler can actually get to your page is paramount. So this is just, again, a little reminder of the different steps that the crawler does. This is the order it follows, but it can get stuck in any part, and these different parts are really important.

 

We already talked about them last time. Today we’re going to talk specifically about crawling, internal linking and the role it plays in identifying the importance of the pages, and indexing. We’ll also briefly touch upon canonicalization and how we can solve it, and internationalization, and we’re going to have a totally separate lesson for Core Web Vitals.

 

Because this one, lots of people say that it’s super important for SEO. Other people say that it’s not important at all. For me, it’s more important for the user than for SEO, but at the end of the day, the user is the one converting, right? So that’s why we always consider it a good part of SEO. So, focusing on crawlability and indexability, we already explained the difference last time.

 

But just to be a little bit more specific, crawlability is when a crawler, or the robot of Google, can actually get to your site and spend time crawling or visiting all the different pages on your site, so being able to jump from one to another in order to identify if it’s useful or not. Indexability, though, is whether that page of your site can be used and displayed in search engines for the users to engage with it.

 

So it’s different steps of the process, and two equally important ones. So how can we make sure that our site is crawlable? A very important concept that you need to understand is the page rank algorithm. This page rank algorithm is one of the main ones that Google created when the search engine was created, and it is one that has always stayed as a core part of Google’s algorithm.

 

It’s also, once you understand it, very easy to create any sort of strategy from it. So the page rank algorithm works at a micro level, in your website, and at a macro level, across sites and the whole of the web, to identify the importance of a website. And it does it through site structure, internal linking, backlinking and clicks.

 

So where it says clicks in there (not confirmed, but well known) has now been confirmed, after the DOJ trial as well as the European trial. Obviously no one has made it public, but we know that clicks, or the information that Chrome retains about how your users engage with your page, is part of the page rank algorithm as well.

 

So if your page is crawled properly and shown in Google Search, but the user doesn’t click on it, it will be demoted and the points will be reduced. So we can see here that every page (A, B, C) has a different measure, or points, based on the amount of internal links it has within the website and the external links it has externally.

 

So based on these points, it will have more importance or less importance on the web. And this image is actually taken from the patent, from Google’s patent. It’s not something that SEOs have invented. This is public, and it’s registered as Google’s property.

 

So we can see the arrows are links, independent of whether they are external or internal, pointing to different areas. And for example, a website that will probably receive a lot of external and internal links would be a newspaper, right?

 

So if you’re thinking of The Guardian, The Guardian will always have a very high page rank or very high domain rating. So with all these tools, I have always used domain rating, or domain authority, as a measure; they are meant to be the equivalent of page rank.

 

Only page rank is deeper and more complex. So we could assume that, for example, The Guardian would be a C in this bubble, while the government page, let’s think of the UK or the US government page, would be B, higher. Why?

 

Because it’s a very big authority, giving links to everyone. But lots of people are also giving links to the government; any single person referring to them on the web, Wikipedia, the newspapers, all of these are going to be referring to them, and they are going to have a lot of content.

 

So that is how Google and page rank use the information to decide which page is important. Why am I telling you this? Because, as I said, it applies at the macro level and at the micro level. Whenever your page doesn’t have the right external links or internal links for Google to crawl, Google is not going to be able to allocate a number to your page.

 

So let’s say you have four pages that are really important on your website, but none of them are in the navigation, or only three are in the navigation and one is not, and it doesn’t receive internal linking or external linking. Google is not going to find it. Google is not going to prioritize it.

 

That’s why internal linking is really important. This image that you see here, where you’re going to be like, what is this? This is a tree structure image, and it’s coming from Screaming Frog. And this is from a real website.

 

This is from a website called eloomi, and it actually represents the homepage and the different links that go to the different parts, and the relationship and connection there is in the tree.

 

carmen domínguez rodríguez 9:19
When I share the presentation, go into these slides, get the image and open it, because you’re going to see how crazy the structure is. You will see that there are very important pages at the top, so they should receive a very high page rank, but they are noindexed.

 

So that means we are telling Google, this page is really important, we are putting it at the top of the tree, and then you are telling Google, no, no, it’s not important, don’t index it. So why are we putting it so high in the tree and telling Google it’s important if we are not indexing it? That’s why the site structure, the signals that we’re giving Google with where we place our pages, the links that we’re putting in, and the directives that we’re

 

giving Google all need to be aligned. I’m talking about signals and directives, which are two different things, and I’m going to explain them now. We will do a tree crawl of the site that you’re using, producing this structure to identify issues, and this is one of the most basic things that you will do as a technical SEO. The bigger your site is, the more complex this tree is going to be, and the more attention you need to pay to it. You remember last time we were talking about crawling, right, and crawl budgets?

 

So if your tree is really confusing and it’s very difficult for Google to navigate, Google is not going to use the budget wisely or effectively, so the cleaner and easier it is for Google to navigate it, the better chances you have of getting your pages ranking.

 

So the question is, how do I get my pages organized nicely, and how do I help Google navigate these pages? The classic way to do this is through internal linking, because internal linking is what Google uses to identify the importance of these pages, to add context to the pages and to navigate them. So here is a very simple example of how Semrush recommends doing internal linking.

 

So we have the page that is potentially the domain, the home page, linking to one and two. So we’re telling Google pages one and two are second or third, depending on the tool that you’re using, in the page rank structure. Then from one and two, we are linking to 3, 4, 5, 6, 7, 8, and all of these will be fourth or fifth, depending on the tool, in the page rank.

 

So when I say fourth and fifth, it means we are telling Google these pages are fourth or fifth in the order of importance for you to crawl. Is it making sense?

 

So if you want to tell Google that page three is really important (link three here in the image), you need to make sure that it has a link from one, but also a link from two, and also links from four, because we need to make sure that Google is always going back to this page. The more gates it has to get to the page, the more importance this page has. And here I’m talking about internal linking, which is the links that you can control.

 

If, on top of this, you also add an external link, then Google adds extra points to the page rank, and again, you can see this represented at the level of the whole internet and how the internet works. A very classic way to actually improve internal linking is by adding breadcrumbs.

 

And I’m sure that all of you have heard about breadcrumbs before. It also has a schema, and the breadcrumbs that you normally put in articles are a way to make sure that Google gets to the long form or to the blog in an easy way, because obviously Google is getting to your site through the robots.txt.

 

It goes to the sitemap, goes to the most important pages, and only then gets to the blog, because the blog or the articles will rarely have a link from the homepage. That doesn’t mean it doesn’t happen. But by the time it gets to the blog,

 

imagine, particularly if you wrote the blog post three months ago, so it’s on page five of your blog, it’s going to take forever; Google will have to come three or four times to your website before it gets there. If you add breadcrumbs, though, it’s going to be easier for the crawler to go back and forth, and also for the user.

 

If I am on your blog and I’m reading about, in this case, the importance of feedback, and I want to go to the feedback page or back to the home page, it’s easier for me to go and click Home, so I’m clicking rather than using the back arrow, back, back, back. So I’m telling Google, or Chrome in this case, that I’m clicking on this page because for me it’s important.

 

So I’m adding extra information, through Chrome, to Google, saying this is a thumbs up. So earlier I said internal links are important, external links are important, satisfaction is important, clicks. So a click is a thumbs up. It’s like whenever you’re posting on Instagram: the more thumbs up you get, the more important your images are. This is the same for Google with clicks.

 

So breadcrumbs are an easy way to actually get this internal linking and this crawl going through. But you might have a lot of internal links, and if you don’t tell Google the context of these internal links, the value they have is way less.

 

So this is when we talk about anchor text and why the anchor text is so important, right? So you tell Google, you put in a link here, this is recent articles, and the text says “strategies for building a skill-based organization”.

 

But when you actually go to the code, there is no anchor text because that link was added to the image, so Google is not understanding that link. Google is not understanding the context of that link, and it doesn’t understand what that page is going to be talking about. That’s why it’s really important that whenever we are doing audits on internal links, we are looking at the anchors. It’s super typical that we find anchors with “click here”, “go here”,

 

“more information here”, “resources”, and so on. All of these are not providing any sort of context to Google, and it’s a missed opportunity to get non-brand keywords ranking. And again, we’re going to be doing a whole training on internal linking, because it’s one of those classic issues that SEOs don’t pay attention to, and it’s what makes Google understand your page better.

 

Another way that Google might not understand your page, or not get to your page, is when you have a lot of redirects, or what we call a redirect loop. Redirects are giving information to Google to say this is not the final page, you need to go to the next one to actually reach it. So it’s crawl budget that Google is using to go from one page to another. If you have only one redirect, that’s okay.

 

Google will spend the time. If you have four redirects, Google gets to URL number one, then you are telling Google to go to number two, then number three, then number four; it spends the time that it could have spent on four different URLs.

 

So it’s very important that we control the redirects and keep them as clean as possible, so as not to waste crawling. Does it make sense? Because I think this one is a little bit more difficult. Okay, if you don’t understand anything, just interrupt me, because I’m giving you guys a lot of information.

 

Looking into redirects is something that Google Search Console will flag for you, and it’s very typical. For example, you have an ecommerce site where your product goes out of stock, and then someone in the company says, oh, let’s put a redirect to another product; then the product comes back, but that redirect is still there, and then another redirect is added when the second product is out of stock. So you end up having Google going through four different products, because you are constantly telling it to, and Google, at some point, will stop crawling the four products.

 

So you have four products that Google is not finding. If this is too complex, we will do a task on this anyway. Links to 404 pages are a classic issue as well.

 

Unzila Siddique 18:08

I wanted to ask, from your perspective, let’s say I need to redirect my URL one to a final URL. Instead of redirecting URL one to two to three, if I redirect all of these URLs to the final URL, does that work as well?

 

carmen domínguez rodríguez 18:33

You need to redirect to the one that is the closest. If you have four, but they are all going to one, it’s okay. The important thing is that you don’t redirect to one, and then to another, and then to another. So three jumps, it’s better not to.

Unzila Siddique 18:49

So I do not need to write a lot of redirects. I can just, like, do one?

carmen domínguez rodríguez 18:58

So it doesn’t matter the amount of redirects that you point at one page; the important thing is that the jump is only one. So if you have 10 pages that you are redirecting to one, that is okay. What you shouldn’t do is put one redirect to another one, which then goes to another one.

 

So with three URLs, it’s better to only have one jump, so one redirect to another. Is it clear? Normally, when you do it in one go, it doesn’t tend to be an issue. The issue comes with time. I think you’re frozen, maybe,

carmen domínguez rodríguez 19:40

yeah, Stephanie,

stephanie inabo 19:46

Thank you, Carmen. So I’m thinking about, I don’t know if we’re going to cover that topic, but in the case where there’s a site migration and you have to use 301 redirects from, like, old URLs to new URLs, how do you avoid having issues?

carmen domínguez rodríguez 20:07

Migrations are another story. For migrations, we’re going to do a whole training on its own, because it’s a massive topic. So for redirects, normally what you will do is redirect mapping, and what you need to do is get all the URLs that you have, the old ones and the new ones, and decide which one is the relevant one.

 

You will always map one to one, so equal, unless you have three pages here for which you are not creating an equivalent; then you put those three into one. Normally, you do all the tests when the page is not live yet, to make sure that everything is okay.

 

And after you have done the migration, you need to leave one week to make sure that all the redirects are working fine, because sometimes Google will take a long time to actually recognize the redirects. But again, we will do a training on that, because migrations, while I love them, are the classic tech SEO nightmare.

 

Because unless you are really precise and you have everything under control, it will go wrong. Sometimes even if you have everything under control, it will still go wrong. But yeah, I think you touched on 301 and 302. There are two types of redirects, the temporary redirects and the forever redirects, depending on what you want to do, whether you want to tell Google to always... Well, I’ve found very few situations in which I have used temporary redirects.

 

Most of the time when I use a redirect, it’s because I have thought it through a lot and I have made the decision that the redirect is the only solution, and if it’s the only solution, then it’s forever, not temporary. But it’s something that you need to pay attention to, because if it’s something that you want to implement forever but you put a redirect that is temporary, Google won’t take it seriously. But again, we will have a whole session about this. Thank you, JaQueen.

 

JaQueen McNair 22:10

Sorry for the extra redirect questions, but thank you for taking my question. So not at the internship, but at my regular job, they closed down one of the companies earlier this year, but I had written a lot of blogs there that were ranking number one and that are really advantageous for the organization that still exists, so I was trying to advise them on how to migrate those blogs over to the main company, like the larger organization’s website. But my question is, should I also be migrating all of the internal links? Yes? Yes, yeah.

carmen domínguez rodríguez 23:03

So whenever we do the migration training, internal links in migrations are going to be one of the main things. Okay, so if you are redirecting from four blog posts to four new blog posts, you don’t need to redo the internal links in the redirects, because Google is going to be jumping.

 

It’s not going to be spending time on the crawling, so you don’t have to. But if you are using some of those articles and putting them in a new place, so you have a new website that you are building and you are transferring the content, you need to put particular emphasis on the internal linking, because if the internal linking is linking to the old pages that are redirecting, you will create what is called a redirect loop.

 

That is, Google gets into your page, it goes to the internal link, goes to the old page, comes back because of the redirect, crawls it again, goes into the internal link again, and what Google will do is stop crawling the page because it gets confused. So that’s why internal linking is really important.

 

But if you are not transferring the content and you are simply redirecting to new content, there’s also the question of whether you need to redirect it at all, but that is another question; then you don’t need to change the internal linking in the old one. But yeah, don’t worry. You’re all going to hate me, because for the migration I’m going to give you a very complex task. So don’t worry.

 

JaQueen McNair 24:44

I’m excited. I need it.

carmen domínguez rodríguez 24:47

Don’t say that. Okay, are we done with the redirects? So, another classic that you will find on your page, and this is very important as well: whenever you’re doing internal linking, if you are internally linking to a page that is a 404, Google will obviously stop crawling that page because it’s not finding it. And the more 404s you have on your page, the more negative points Google allocates to your page.

 

So you know how we said earlier that page rank gives points to your page, and you have a certain amount of points: the more 404s Google finds, the fewer points you’re going to have, because Google is going to keep thinking that your page is of bad quality, because all it finds is 404s, and it doesn’t want to show users a page that has a lot of 404s.

 

So it’s important that you look into 404s. Now, of course, this needs to be put in context. If you have an ecommerce site that has millions of products, finding 404s is only natural, because your products are going to break, because they will disappear, for millions of reasons.

 

So, for example, when I used to work at Lookfantastic, we had a million products on each website, and we had 24 websites. What we used to do was get a crawler that would send me a report every single day with all the 404s there were, so we could fix them.

 

And I was not freaking out when I saw it. Now, if you have a small site with only 10 products, and out of those 10 products five of them are 404s, the chances that Google is not going to consider you relevant are really high. So that’s why we need to pay attention to 404s. Again, I’m not saying you should freak out when you see one, but you need to put it in context.

 

It’s important to fix the 404s. And when I’m talking about 404s, there are also soft 404s, which we’re going to talk about in a minute. What is a soft 404? That should be your question; we’ll talk about it. But 404s happen for all sorts of reasons: the URL is broken, you put an internal link with the wrong words, and so Google will create a 404. But Google Search Console should be able to tell you when there is one. So, internal links summary: Google crawls via internal linking, and also external linking,

 

even when the domains are not related. It weighs the importance of the URLs using internal links. So the more internal links a page has, the more important it will be. It will also place it according to where it sits in your sitemap and robots.txt.

 

But having internal linking is a good way to give importance, and it’s what I just said: internal linking establishes the architecture or hierarchy of your page. So if you have the homepage, the chances that the homepage is the one that’s going to have the most internal links are really high, because every single breadcrumb is going to go back to the homepage, every single page is going to have a relevant link back to the homepage, and so on.

 

So the structure and the hierarchy of a page will really be linked to the internal linking. I’m sorry, I forgot this; also very important is anchor text. As I said, you need to give context to Google. If the context you give Google is “Read More”, it’s not going to end up prioritizing that page. I think I mentioned it last time as well.

 

If you have an article and your internal link is at the bottom of the article, at the very end, and it says “Read more here”, that link is not going to have priority for Google, because it doesn’t have priority for you. You have put it at the very end of an article, at the bottom, without context.

 

So Google is thinking, okay, this link is at the very end of everything, with no context for the user and no context for me, so it’s not important for them either. So the place where you put your internal links and the context that you give them does have a value. Recommendation:

 

try to always put your internal links at the top, as soon as you can, where the user is starting to read, and with the best non-brand keyword that you can. Controlling crawlability: so we have gone through what is more important or less important, and you can control crawlability using the robots.txt. If you tell Google in the robots.txt not to crawl certain pages, because it should not spend time on them, it will focus only on the ones that you want it to crawl.

 

So just to give you a more specific example: if you have a WordPress page, you need to have a content, how do you call it, container, which, for example, WP Engine is, a container where you put all your content so you can then use it in different parts of the website, right? So, do you want Google to actually crawl your container? Probably not.

 

You want Google to only crawl the content and the pages that are important for you. Another example: on your website you have the search option. Every single time someone searches on your site, let’s say it’s an ecommerce site (I keep using the example of ecommerce because they are my favourite websites, they’re the more complex ones, but you can apply this to everything), you have a search bar on your website and you type red shoes.

 

Every single time someone types red shoes, a new URL will be created, and it will be a URL that says filter equals red shoes. Should that URL be crawled? Should that URL be indexed? Probably not, because it is not important. It’s not static. It’s a... I forgot the word. This is the problem of not thinking in English.

 

I forgot the word; it’s the contrary of static, dynamic. It’s not static, so it’s not important, and you don’t want Google to be crawling that. Having said that, famously, Wayfair is actually getting all the filters crawled and indexed and trying to get all the filters to rank, because they have so many products that they end up ranking for them. It’s a bit of a spammy tactic, but for them it’s working. So it’s up to the SEO to choose the tactic.

 

You can have a look at Wayfair and how they do it. In general, you don’t want the search to be crawled and indexed, because people can type whatever they want in the search. So I said red shoes, but someone can type, sorry for mentioning bad words, anything at all, and you end up ranking for bad words as well. So it’s important that you control what gets indexed and ranked.

 

Basically, yes, I just explained that, but there are more ways that you can control crawlability as well, which is exactly making sure that you always link the pages in the right places, so the crawler can actually find them easily. Another way, sorry, that I didn’t put in here, yes, is the sitemap.

 

So, the sitemap that you put in the robots.txt. The crawler gets to your site, and the first thing that it reads is the robots.txt. So as an SEO, the first thing that you need to do, always, whenever someone asks you to look into a website, is type /robots.txt, because the robots.txt is going to tell you all the basic information you need to know.

 

In the robots.txt you need to put the sitemap, because the sitemap is your Bible; you’re telling Google, hey, these are all my important pages. I want you to crawl these pages first, then get lost in my website, then discover whatever you want, but crawl these every single time.

 

So something that you also want to do is make sure that you have a very clear and well-structured sitemap; this is the most important bit. You will first put your categories and your navigation, then you will put your blog posts. If you have an international site with 15 versions, you will create a sitemap for every international site.

 

So have it really well organized. And when you have done that, then you can let Google loose, if that makes sense. From the robots.txt, you can also decide which crawlers you want to be able to crawl your site. So we always talk about Google, but this is something that is very important for you guys with AI nowadays.

 

AI steals your content; it gets your content through the crawler. So ChatGPT, they have their own crawler that gets into your site, crawls everything, copies the content, puts it in the database, and they use that content to then train the models. Is that something that you want to allow or not? There are lots of websites that have blocked ChatGPT’s robots, sorry, not the txt, the robots.

 

Others haven’t, because you might want to show up in ChatGPT, so you want your content to be there in case, in the future, people use ChatGPT to search for you. Here you can also control which tools crawl your site. So a lot of people block Ahrefs or Semrush because they don’t want Ahrefs or Semrush to have their data. So it’s the robots.txt that you use to, let’s say, forbid or allow crawlers to your content.
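[Editor’s note: a hedged sketch of the per-crawler control described here, a robots.txt that leaves Googlebot under the default rules but blocks GPTBot and AhrefsBot, verified with Python’s standard-library parser. The site URL is a placeholder, and whether to block these bots is the business decision Carmen mentions.]

# Per-user-agent robots.txt rules checked for three different crawlers.
from urllib import robotparser

robots_txt = """\
User-agent: GPTBot
Disallow: /

User-agent: AhrefsBot
Disallow: /

User-agent: *
Disallow: /checkout
"""

rp = robotparser.RobotFileParser()
rp.parse(robots_txt.splitlines())

for bot in ["Googlebot", "GPTBot", "AhrefsBot"]:
    allowed = rp.can_fetch(bot, "https://www.example.com/blog/article/")
    print(f"{bot:<10} {'allowed' if allowed else 'blocked'}")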

 

carmen domínguez rodríguez 34:41

Any questions so far? Yes: where are we typing robots here? So I’m going to show you an example. Window, no, share screen; let’s do my screen.

carmen domínguez rodríguez 35:16

Every single website will have something like this. This is mandatory. This one is not particularly beautiful, but it’s like this because it’s Shopify. In Shopify, you cannot create your own robots.txt; Shopify creates it for you in the way that they want. But you can see here it has these disallows.

 

So don’t crawl admin, don’t crawl checkout, don’t crawl the filters (you see “filter”), don’t crawl any blogs that have these letters, and here as well, allow the collections. User-agent AdsBot, so the ads can actually crawl the page and get information; AhrefsBot, which is what we were talking about.

 

You can also say disallow AhrefsBot if you don’t want it to crawl, etc. It’s important that you understand the language here. It’s not mandatory, because to be a good technical SEO you don’t need to know how to read this, but it will help you a lot, because this is a language of its own. So for example, I was looking into the robots.txt of this one to see if it works well.

 

And I understood here that by doing this it was allowing every single crawler to crawl it, because user-agent equals star means everyone can crawl it, so even with a few of them below, anyway, because of this. But you could also put this, oh sorry, this, and then another allow below. If you as an SEO only see the first one and you don’t analyze the whole thing, you might not realize that it’s blocking something that you need, or that the company needs. Here is the sitemap as well.

 

So this is what the sitemap looks like. And I just realized that there are some things that can be improved here, but yeah, here we have the products, the pages, the collections and the blogs, so the most important pages are always here, telling Google to crawl them first. And on anything that is not Shopify, you can change this with Yoast. I’m sure that you have all seen Yoast.

 

The Yoast app will allow you to change it, but there are lots of them anyway, and we’re also going to have a training on the different types of tools or platforms that we can use and how we try to change things in each of them. Going back to the presentation, okay, so what can affect indexability?

 

I talked about signals earlier, signals and directives. Signals are the things that you are telling Google that you prefer, and directives are things you are forcing the crawler to do. So, for example, canonical is a directive, it’s a signal, sorry. Hreflang is a signal, meaning Google can ignore it if it doesn’t agree that what you’re saying is the right thing. And noindex, disallow in the robots.txt, or redirects are directives; they are forcing Google to go in one direction.

 

It’s very typical that you set up a canonical and Google ignores it; it’s like a classic thing. Canonical, what is it? We talked about canonicals last time. A canonical is a tag that you put to tell Google that, if you have four very similar pages, one is the one that is important.

 

And you might ask yourself, why would I need four pages that are really similar? For example, again, ecommerce: you have a product that is blue, but then you have another product that is brown, another product that is green, another product. So you want to make sure that one version, the version without colors specified in the URL, is the main one.

 

And the other three are canonicalized. So that’s why you have three, because it’s important for the user to have four versions, because the user actually needs to make a decision. But the crawler doesn’t need to choose four; it only needs to choose one and show one.

 

So that is when you have a canonical. A classic situation with SEOs is to get canonicals and redirects confused, so understanding the difference between each of them is going to make your life way easier when you need to find issues.

 

Another thing that Google does a lot, as I said, is confusing canonicals. So if you have three products, as I said, you have the Adidas trainers, Adidas trainers with flowers, trainers in white, and Adidas trainers with Pokémon. I don’t know, I like Pokémon.

 

You’re going to hear me using Pokémon as an example a lot. And suddenly you’re putting all these three in the canonical here, but you have another Pokémon product, and you have decided to choose very similar titles, very similar descriptions.

 

So Google might decide, hey, you’re saying that this is the main one, but from the keywords that you’re using, this one looks more alike, so I’m going to ignore your canonical and choose my own canonical, and put this one as the canonical here. So then you know that there’s something you need to change, either in the content or in the links, to clarify it for Google:

 

hey Google, you are getting it wrong, or you’re getting it right but I’m not clarifying it for you very well, so I’m going to change the content. Redirects, again, are for when you have one piece of content that is no longer relevant or no longer important, and you have created another one that is more relevant for your users, and you want the users to see only one; you don’t want the users to see three other versions.

 

You want only one to be findable by Google, and that applies when your product is no longer there, when you are not selling one product anymore, so you want your user to get to the new product that you have created; or you have a website that you don’t use anymore because you have created a new one.

 

You want the user to get to the new one. So it’s difficult sometimes to decide what you want to do, because most people just go crazy and start putting redirects on everything. I say, before putting redirects on everything, ask yourself these questions: are users supposed to find this URL or not? Can we risk indexing instability? Do we have duplicate content? Are both pages needed, yes or no?

 

 

How many internal links does each URL have? External links are also very important. And how much weight does this URL have over the other? So am I actually de-indexing one that has 15 links in favour of one that has zero? You need to ask yourself these questions first if you don’t know what to do.

 

The classic is to ask me, or anybody else, what is more important for you. But you need to think; don’t take this decision lightly. Any questions on this?

 

No? Okay, so: canonical is a signal, redirect is a directive, noindex is another directive. And noindex is something that you put in the code to tell Google not to index a page because you don’t want that page to be found, or, classic, because it’s been put in the index by error.

 

So it’s very important that we look into the meta robots to understand what we have told Google to do. There are two ways that you can make the robot not come to your content: through the robots.txt, because you’re telling it no, no, don’t crawl here; or through the meta tag, where you’re telling Google, no, no, don’t go here. You have forbidden it from the top of the code.

 

It’s super important that we look into what’s happening here, because it’s a classic situation where you have, like here, the eloomi.com resources article page, actually a very important part of the content hub, being noindexed, which automatically blocks Google from continuing to crawl all the articles.

 

So basically, after that articles page, there were thousands of articles that Google was not going to reach, because none of them had internal linking and because this page had been noindexed. And they wonder, why is my content plan that I have been writing for a year not yielding any results? That’s why. So it is very important that we also control what is indexed and what is not indexed.

carmen domínguez rodríguez 44:17

Any questions? Yes,

JaQueen McNair 44:28

Hi, thanks. Sorry, I mean, you can see the question. I know how to make a sitemap, but I don’t know how to put a sitemap into a robots.txt file, wherever that is, especially if I don’t have a CMS that automatically generates one or supports one. And so at the internship that I am at, they use something called All in One SEO,

 

carmen domínguez rodríguez 45:01

Oh, All in One SEO should. It doesn’t allow you to put sitemaps in the robots.txt?

JaQueen McNair 45:08

I don’t know, I mean, I’m probably not doing it right, but it just doesn’t seem, when I go and check on it after the fact, that these websites that are showing up are being indexed.

carmen domínguez rodríguez 45:26

There are many times where you need to actually get a developer to update the robots.txt, because it will depend on how it was created, independently of the CMS. There will be cases like Shopify where you cannot edit it, like there’s no way you can edit it because Shopify doesn’t allow it. Then with WordPress, you can edit it.

 

But if the developer originally decided that only a developer can make changes, and blocked the plugins from making changes to it, you might well have Yoast or All in One SEO and it won’t change it. So you need to get hold of the developer to try to do it. If not, you can do the classic thing, which is putting the HTML sitemap at the bottom of the page. I don’t like it.

 

I think it’s a very old-fashioned way of doing SEO. But if you don’t have any other way to do it, potentially you could. But the ideal is that you get a developer to include the sitemap in the robots.txt. The reason why developers block people from making changes in the robots.txt is because it can be really dangerous. You might, for example, remove a disallow and suddenly allow everyone to see users’ accounts. So imagine, you might be giving away credit card details, addresses, a lot of information.

 

So I understand why robots.txt files are blocked from people accessing them. What I recommend here is just try to make friends with your developer. If you make friends with your developer, you can do magic, trust me. Any other questions? Cool.

 

We have 10 minutes for me to give you the task. As last time, I’ll follow up with the recording, well, the recording that I didn’t do last time, but I’ll follow up with the recording, the presentation, the templates and a brief for the task, and then the following week, the day before,

 

I will also let you know what we’re covering, but I’ll tell you now: we’re going to be covering all the tasks, so you guys need to do them, because we’re going to be going through them, and we’re going to be talking about Google Search Console in more detail, beyond just technical errors. Cool.

 

stephanie inabo 47:48

Thank you.

carmen domínguez rodríguez 47:49

Thank you. Have a lovely weekend, guys. Bye.