Bug 36977: Google does not read sitemaps with the name sitemapNNNN.xml
We have experienced problems with Google indexing: it appears (as
reported in Google Search Console) that Google does not read sitemaps
named like sitemapNNNN.xml. Renaming them to sitemap_NNNN.xml resolves
the issue: the individual sitemap files are then read as they are
declared in sitemapindex.xml.
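
To see which naming scheme a generated index declares without waiting
for Google, the chunk names can be read from sitemapindex.xml directly.
The following is a minimal sketch, not part of the patch. It assumes the
index uses the standard sitemaps.org namespace; the helper names and the
host opac.example.org are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Namespace from the sitemaps.org protocol (assumed to be what the index declares).
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical index content; opac.example.org is a placeholder host.
SAMPLE_INDEX = """
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>http://opac.example.org/sitemap_0001.xml</loc></sitemap>
  <sitemap><loc>http://opac.example.org/sitemap_0002.xml</loc></sitemap>
</sitemapindex>
"""


def chunk_names(index_xml):
    """Return the file name of every <loc> declared in a sitemap index."""
    root = ET.fromstring(index_xml.strip())
    return [
        loc.text.strip().rsplit("/", 1)[-1]
        for loc in root.findall("sm:sitemap/sm:loc", NS)
        if loc.text
    ]


def uses_new_naming(name):
    """True for sitemap_NNNN.xml, False for the old sitemapNNNN.xml form."""
    return re.fullmatch(r"sitemap_\d+\.xml", name) is not None


if __name__ == "__main__":
    for name in chunk_names(SAMPLE_INDEX):
        print(name, "new naming" if uses_new_naming(name) else "old naming")
```

Saving the real sitemapindex.xml to a file and passing its contents to
chunk_names() in place of SAMPLE_INDEX gives the same check for an
actual instance.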
Test plan:
==========
1. Have your site configured to work with Google Search Console
(cf. https://support.google.com/webmasters/answer/9008080).
2. Generate the sitemap with:
misc/cronjobs/sitemap.pl --dir /var/lib/koha/<instance>/sitemap
(create the directory if necessary)
3. Check with your browser that the sitemap is generated:
http://<OPAC-url>/sitemapindex.xml
Check that the individual pieces are readable:
http://<OPAC-url>/sitemap0001.xml
4. Go to the Google Search Console > Sitemaps > Add a new sitemap
Enter sitemapindex.xml
5. Google will most likely fail to read your sitemap chunks and show
the warning: "Couldn't fetch".
6. Apply the patch. Repeat steps 2, 3 and 4; the chunk name is now
different:
http://<OPAC-url>/sitemap_0001.xml
7. You should see Google reading your entire sitemap with the info:
"Success".
Signed-off-by: Michael Skarupianski <michael.skarupianski@gmail.com>
Signed-off-by: Michał Kula <148193449+mkibp@users.noreply.github.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Katrin Fischer <katrin.fischer@bsz-bw.de>