When a crawler requests https://example.com/sitemap.xml, it receives a sitemap index. Every entry in that index points to a file on the Miraheze subdomain, such as https://example.miraheze.org/sitemaps/example/sitemaps/sitemap-NS_0-0.xml.gz. That sitemap in turn lists URLs on example.miraheze.org, *not* on the custom domain. As a result, sitemaps on custom domains are effectively nonexistent, and no sitemap-assisted indexing of pages on the custom domain takes place.
Hi,
Thank you for creating a task on Phorge, we will endeavor to resolve it as soon as possible.
If you notice that your task has not received a response or follow up in a reasonable amount of time please comment on it.
Thanks,
Miraheze Technology Team
This wiki also uses the default Miraheze favicon, but only on the subdomain.
Edit: this is now fixed
Weird...
> curl --no-progress-meter https://chinafake.wiki/sitemap.xml | head
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>https://chinafake.miraheze.org/sitemaps/chinafakewiki/sitemaps/sitemap-chinafakewiki-NS_0-0.xml.gz</loc>
<lastmod>2024-08-24T00:31:25Z</lastmod>
</sitemap>
<sitemap>
<loc>https://chinafake.miraheze.org/sitemaps/chinafakewiki/sitemaps/sitemap-chinafakewiki-NS_1-0.xml.gz</loc>
<lastmod>2024-08-24T00:31:25Z</lastmod>
</sitemap>

> curl --no-progress-meter -L https://chinafake.miraheze.org/sitemaps/chinafakewiki/sitemaps/sitemap-chinafakewiki-NS_0-0.xml.gz | gunzip | head
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://chinafake.miraheze.org/wiki/!</loc>
<lastmod>2024-08-21T02:37:58Z</lastmod>
<priority>1.0</priority>
</url>
<url>
<loc>https://chinafake.miraheze.org/wiki/%22HAPPY%22_Animated_Jumprope_Characters</loc>
<lastmod>2024-07-19T05:56:49Z</lastmod>
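For anyone who wants to repeat this check without eyeballing the output, here is a minimal sketch in Python. It assumes the sitemap XML has already been downloaded (and gunzipped where needed); the helper name `mismatched_locs` is made up for illustration and is not part of MirahezeMagic or any Miraheze tooling. It lists every `<loc>` whose host is not the expected custom domain.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Namespace used by both <sitemapindex> and <urlset> documents.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def mismatched_locs(sitemap_xml: str, expected_host: str) -> list[str]:
    """Return every <loc> URL whose host differs from expected_host.

    Hypothetical helper: works on a sitemap index or a urlset, since
    both put their URLs in <loc> elements under the same namespace.
    """
    root = ET.fromstring(sitemap_xml)
    mismatched = []
    for loc in root.iter(f"{SITEMAP_NS}loc"):
        url = (loc.text or "").strip()
        if urlsplit(url).hostname != expected_host:
            mismatched.append(url)
    return mismatched


if __name__ == "__main__":
    # Trimmed-down copy of the index shown above.
    index_xml = (
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<sitemap><loc>https://chinafake.miraheze.org/sitemaps/chinafakewiki/"
        "sitemaps/sitemap-chinafakewiki-NS_0-0.xml.gz</loc></sitemap>"
        "</sitemapindex>"
    )
    for url in mismatched_locs(index_xml, "chinafake.wiki"):
        print("not on custom domain:", url)
```

Note that the index entries themselves live on the miraheze.org subdomain, so this will flag them even when the page URLs inside each sitemap are correct; running it against the unpacked `urlset` files is the more telling test.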
It appears that sitemaps are automatically regenerated every Saturday. It's Saturday right now, but they don't seem to have been updated yet. Try waiting a few hours (or perhaps even a day)?
It's now Sunday and the sitemap hasn't changed. Is it possible for someone to regenerate it manually?
Yep, someone with server access will need to run generateMirahezeSitemap.php (from extensions/MirahezeMagic).
Nope, it's gone again…
Could it be prioritizing the subdomain over the chinafake.wiki domain when generating the sitemaps?
> curl https://chinafake.wiki/sitemap.xml?$RANDOM | head
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  3860  100  3860    0     0   4455      0 --:--:-- --:--:-- --:--:--  4452
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>https://chinafake.miraheze.org/sitemaps/chinafakewiki/sitemaps/sitemap-chinafakewiki-NS_0-0.xml.gz</loc>
<lastmod>2024-08-24T00:31:25Z</lastmod>
</sitemap>
<sitemap>
<loc>https://chinafake.miraheze.org/sitemaps/chinafakewiki/sitemaps/sitemap-chinafakewiki-NS_1-0.xml.gz</loc>
<lastmod>2024-08-24T00:31:25Z</lastmod>
</sitemap>

> curl https://chinafake.miraheze.org/sitemaps/chinafakewiki/sitemaps/sitemap-chinafakewiki-NS_0-0.xml.gz -L | gunzip | head
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   162    0   162    0     0    523      0 --:--:-- --:--:-- --:--:--   524
100 24800  100 24800    0     0  43867      0 --:--:-- --:--:-- --:--:--     0
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://chinafake.wiki/wiki/!</loc>
<lastmod>2024-08-30T04:19:40Z</lastmod>
<priority>1.0</priority>
</url>
<url>
<loc>https://chinafake.wiki/wiki/%22HAPPY%22_Animated_Jumprope_Characters</loc>
<lastmod>2024-07-19T05:56:49Z</lastmod>
I swear...
The sitemap is generating properly; the issue is with replacing the old sitemap with the newly generated version. I confirmed that the script generates the correct version of the sitemap, but it then doesn't update the old one.
Would it be possible to use the sitemap WikiSEO generates instead of the one MH generates?
Seems the sitemap has updated properly. https://chinafake.wiki/sitemap.xml now says <lastmod>2024-09-03T21:01:16Z</lastmod>.
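To confirm a regeneration without reading the XML by hand, something like the following rough sketch could pull the newest `<lastmod>` out of a downloaded sitemap index. The function name `newest_lastmod` is hypothetical, and the code assumes every `<lastmod>` is a full UTC timestamp ending in `Z`, as in the output quoted in this task.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from typing import Optional

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def newest_lastmod(sitemap_xml: str) -> Optional[datetime]:
    """Return the most recent <lastmod> in the document, or None if absent.

    Assumes timestamps look like 2024-09-03T21:01:16Z (an assumption
    based on the output in this task, not a guarantee for all sitemaps).
    """
    root = ET.fromstring(sitemap_xml)
    stamps = []
    for node in root.iter(f"{SITEMAP_NS}lastmod"):
        text = (node.text or "").strip()
        if text:
            # Older Python versions do not accept a trailing "Z" in
            # fromisoformat, so rewrite it as an explicit UTC offset.
            stamps.append(datetime.fromisoformat(text.replace("Z", "+00:00")))
    return max(stamps) if stamps else None
```

Comparing the returned value before and after a run of generateMirahezeSitemap.php would show whether the served index actually changed.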
Got it, thanks! It also looks good on the global sitemap at https://static.miraheze.org/sitemap.xml, so I'd say this is fixed!