Billie Raven

shared this question
6 years ago

Employees Involved

PIV Support

Admin

Sitemap

OR 3.2.5

I want the Sitemap for my entire domain to include my OR site.

I've not found a 'create sitemap' capability.

I've created the sitemap for the domain but it skips my OR site.

I suspect that the robots file is preventing my realty site from being included.

Is that correct?

I'm afraid I don't understand why that would be desirable.

I searched the community but have not found that answer yet.

Thanks in advance.

Comments (15)

photo Employee
1

The robots.txt file included with OR only excludes the administration area, i.e. any URLs under http://yoursite.com/admin/. Otherwise it explicitly instructs that everything else be spidered.

  User-agent: *
  Disallow: /admin/
  Allow: /

If your robots.txt file does not look like that, it's not the one that came with OR v3.
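As a quick check of what those three rules permit, they can be fed to Python's standard `urllib.robotparser`. This is only a sketch: the yoursite.com URLs are placeholders, and real crawlers may resolve Allow/Disallow precedence slightly differently from Python's parser.

```python
from urllib.robotparser import RobotFileParser

# The three rules quoted above (per the post, as shipped with OR v3).
RULES = """\
User-agent: *
Disallow: /admin/
Allow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def allowed(url: str) -> bool:
    """Return True if a generic crawler ("*") may fetch the URL."""
    return parser.can_fetch("*", url)


if __name__ == "__main__":
    # Placeholder URLs, used only for illustration.
    for url in ("http://yoursite.com/admin/index.php",
                "http://yoursite.com/index.php"):
        print(url, "->", "allowed" if allowed(url) else "blocked")
```

Only paths under /admin/ come back as blocked, which matches the description above.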

Otherwise, to generate a sitemaps.org-compliant sitemap for Google et al., see the docs:

http://docs.transparent-tech.com/open-realty/latest/generatexmlsitemap.html

photo
1

I am, of course, using the robots.txt that came with OR v3.2.5, which is exactly as you stated above.

When I create a sitemap for the domain (utahlandsale.com) it skips the directory (/listing) where OR is installed.

If I figure out how to manually trigger a sitemap for OR itself, will that be an additional sitemap within (/listing), not included in (utahlandsale.com/sitemap)?

Thanks

photo Employee
1

"When I create a sitemap for the domain (utahlandsale.com) it skips the directory (/listing) where OR is installed."

Without knowing exactly how you are generating a "sitemap", it's not really possible to speculate too much. If you are in fact using OR's sitemap generator, it cannot exclude /listing/, because that is OR's root folder and where OR generates the sitemap. You will, however, probably want to edit OR's robots.txt file and change Disallow: /admin/ to Disallow: /listing/admin/ to reflect your actual OR administration area location, if you don't want the spiders wasting their time trying to index password-protected pages.

If you have OR installed in a /listing/ folder, whatever robots.txt file exists in the root folder above /listing/ is going to preempt OR's robots.txt file. OR cannot do or manage anything outside of the /listing/ folder or its sub-folders.

http://www.robotstxt.org/robotstxt.html
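To illustrate why the root folder matters: under the convention described at the link above, a crawler asks for robots.txt at the root of the host, whatever page it is about to fetch. The sketch below derives that URL from a page URL. The page name `index.php` is a hypothetical example, and third-party sitemap tools are not guaranteed to follow the convention.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL a convention-following crawler would consult."""
    parts = urlsplit(page_url)
    # Only the scheme and host are kept; the path is always /robots.txt.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    # Hypothetical page inside the /listing/ folder.
    print(robots_url("http://www.utahlandsale.com/listing/index.php"))
```

For a page inside /listing/, the result still points at the root of the domain, so a robots.txt stored only inside /listing/ would normally not be the file a crawler reads.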

photo
1

Thank you, as always.

I think I'll attempt to manually add the (/listing) directory to my sitemap.

Or I'll remove the robots.txt from (/listing) and create a new sitemap at the root, see if it will include it that way.

Wouldn't you want your real estate site included in your domain's sitemap? Sorry if I'm missing the reason why not.

I used several different online sitemap builders including the very elaborate one at (auditmypc).

photo Employee
1

"Wouldn't you want your real estate site included in your domain's sitemap?"

That's the point: it should be included. But you don't have OR set up in your root folder, so whatever you have set up there is preventing your /listing/ folder from being spidered.

OR and its robots.txt file can only affect what goes on after the spider has already entered the /listing/ folder. You have something set up above the /listing/ folder that is preventing that.

photo
1

Actually, I have NO robots.txt set up at the root.

The sitemap flows through all the subdirectories except the /listing/ folder, which is why I wondered whether it was the robots.txt within that folder that's making the difference.

photo Employee
1

If removing the robots.txt from the /listing/ folder where OR is located (as you said you would do before) did not solve your problem, you have something else unrelated to OR that is causing this.

OR itself has no way to instruct a spider whether or not to index anything.

photo
1

Sorry, you misunderstood. I did not take the robots.txt out of OR. I didn't know whether I had to, or whether it was a good idea. That's what I was trying to find out here. But I will try that now, of course, and hopefully not have to bother you more with this issue.

Thanks, as always.

photo Employee
1

In your 3rd post you said exactly:

"I'll remove the robots.txt from (/listing) and create a new sitemap at the root, see if it will include it that way."

photo
1

Frustrated with me, just a bit?

Well, anyway, removing it didn't change anything.

Am I the only person who's had this issue? I can't find this situation reported here.

Could it be the permissions set on the directory itself (755)?

Or the .htaccess within that directory?

photo Employee
1

"Am I the only person who's had this issue? I can't find this situation reported here."

Yes, you are, because it's not an OR problem. You need to speak with your host if you don't understand how robots.txt files and search engine spiders work. OR has no control over what a spider or other 3rd party "sitemap" program can index.

"Could it be the permissions set on the directory itself (755)? Or the .htaccess within that directory?"

Not usually if your site is otherwise working. A search engine spider is really no different than a regular web site visitor, so if you can see it in a browser, so can the spider unless it is instructed otherwise by a robots.txt file, or the pages are password protected.

photo
1

OK, thanks, and I absolutely will not ask for help here again.

There is no robots.txt file at the root or in the directory, so if, as you say, it is not a problem with OR, it was confusing to me why it couldn't be accessed like any other directory.

photo Employee
1

That's a question best asked of your host and/or whoever/whatever is spidering your site that is not picking up the /listing/ folder.

OR's built-in sitemap file generator (if you use it) most definitely will create a sitemap of all relevant OR pages and it will place the sitemap file(s) in the root of wherever OR is installed. In your case that folder is /listing/. Also, to be valid, the sitemap file(s) must remain where OR creates them as the links within are based on that location.
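The point about location can be sketched in code. Under the sitemaps.org protocol, a sitemap file normally covers only URLs at or below the folder it sits in. The helper below checks that rule; the file name `sitemap.xml` and the page names are assumptions for illustration, since the thread does not say what OR actually names its sitemap file(s).

```python
import posixpath
from urllib.parse import urlsplit


def in_sitemap_scope(sitemap_url: str, page_url: str) -> bool:
    """Return True if page_url lies at or below the sitemap's own folder."""
    sitemap, page = urlsplit(sitemap_url), urlsplit(page_url)
    # A sitemap only covers URLs on the same scheme and host.
    if (sitemap.scheme, sitemap.netloc) != (page.scheme, page.netloc):
        return False
    # "/listing/sitemap.xml" -> "/listing/"
    base_dir = posixpath.dirname(sitemap.path)
    if not base_dir.endswith("/"):
        base_dir += "/"
    return page.path.startswith(base_dir)


if __name__ == "__main__":
    # Hypothetical file and page names.
    sitemap = "http://www.utahlandsale.com/listing/sitemap.xml"
    for page in ("http://www.utahlandsale.com/listing/index.php",
                 "http://www.utahlandsale.com/about.html"):
        print(page, "->", in_sitemap_scope(sitemap, page))
```

A sitemap stored in /listing/ covers pages under /listing/ only, which is consistent with the advice to leave the file(s) where OR creates them.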

photo
1

Just for closure, from my host:

"Unfortunately I can't guess at what the script is doing or why it's not looking at that directory. There is nothing on the system that would be blocking it; if it's able to check other directories then there's nothing special about the listing/ directory (on our end, at least) that would cause it to be blocked. I would suggest contacting the script author to see if they can provide more information about the problem. They would obviously have more experience and knowledge with their software than we could."

photo
1

I found the error (I think), or at least an explanation of why I can't get a sitemap built from the root that will include OR (v3.2.5) within (/listing).

Creating a sitemap of (utahlandsale.com/listing) from the root excludes that directory, but the same online sitemap generator (not OR) finds this location:

  utahlandsale.com/listing/listing/... file names.

So I have inserted an extra (/listing) location somewhere.

Probably here. Do I have to rerun install?

  Base URL: http://www.utahlandsale.com/listing
  Base Path: /web/sites/pamulex/utahlandsale.com/listing
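One way a doubled folder like this can arise in general is a relative link that already contains the folder name being resolved against a base URL that also ends in that folder. The sketch below reproduces that effect and flags URLs with a repeated adjacent path segment. It is an illustration only, not a diagnosis of OR or of this site, and `index.php` is a hypothetical file name.

```python
from urllib.parse import urljoin, urlsplit


def has_repeated_segment(url: str) -> bool:
    """Return True if two adjacent path segments are identical."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    return any(a == b for a, b in zip(segments, segments[1:]))


# Base URL from the post, with a trailing slash added for the illustration.
BASE = "http://www.utahlandsale.com/listing/"

# A relative link that repeats the folder name yields a doubled path.
doubled = urljoin(BASE, "listing/index.php")
# A relative link without the folder name resolves as expected.
normal = urljoin(BASE, "index.php")

if __name__ == "__main__":
    for url in (doubled, normal):
        print(url, "-> repeated segment:", has_repeated_segment(url))
```

Running a check like this over the URLs a sitemap generator reports could help narrow down which links carry the extra /listing.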
