Ticket #1259 (closed defect: fixed)

Opened 17 months ago

Last modified 13 months ago

Google Scholar link returning incorrect result set

Reported by: npeterson Owned by: npeterson
Priority: unassigned Milestone: 0.9.3_rc3
Component: ambra Version: 0.9.3
Keywords: Cc:

Description

When I select the Google Scholar (GS) links (either "related articles" or "look for citations") for two of the articles published on Feb 2nd, the title of the article is not shown. Instead GS says:

Sorry, we didn't find any articles related to PLoS: 0 notes 0 comments.

See also the enclosed screenshot.

 http://www.plosone.org/article/related/info%3Adoi%2F10.1371%2Fjournal.pone.0004310
 http://www.plosone.org/article/related/info%3Adoi%2F10.1371%2Fjournal.pone.0004298

I looked at other articles in Feb, e.g. on Feb 3rd, and the links seemed to be working properly. Is there something wrong just with the articles published on Feb 2nd?

Change History

Changed 17 months ago by npeterson

E.g. this article

 http://plosmedicine-demo.plos.org/article/info%3Adoi%2F10.1371%2Fjournal.pmed.0020124

has no citations on GS when you click the link, but I think that's because it is searching with the wrong search string, i.e. it searches with

 http://dx.plos.org/10.1371/journal.pmed.0020124.

If you search with just the DOI

 doi:10.1371/journal.pmed.0020124

you get 353 citations.

Andy thinks this is also causing a problem with the Google blog search.
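
The two search strings compared above can be sketched as follows. This is only an illustration, not the Ambra code: the function names are hypothetical, the `cites` form mirrors the link format seen on the demo site, and the `doi:` query form is taken from the manual search described above rather than from any documented Scholar feature.

```python
from urllib.parse import urlencode

SCHOLAR_BASE = "http://scholar.google.com/scholar"


def cites_url_from_resolver(doi):
    """Link built from the dx.plos.org resolver URL (the form that returns no citations)."""
    params = {"hl": "en", "lr": "", "cites": "http://dx.plos.org/" + doi}
    return SCHOLAR_BASE + "?" + urlencode(params)


def query_url_from_doi(doi):
    """Link built from the bare DOI, matching the manual search 'doi:<doi>'."""
    params = {"hl": "en", "lr": "", "q": "doi:" + doi}
    return SCHOLAR_BASE + "?" + urlencode(params)


if __name__ == "__main__":
    doi = "10.1371/journal.pmed.0020124"
    print(cites_url_from_resolver(doi))
    print(query_url_from_doi(doi))
```

Whether the second form finds citations for a newly published article still depends on the article being in the Scholar index.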

Changed 17 months ago by npeterson

Not sure if it's been already reported or considered, but I'm not convinced that the Google Scholar functionality is working as well as it could.

For example, for http://plosgenetics-demo.plos.org/article/related/info%3Adoi%2F10.1371%2Fjournal.pgen.0030050 ( http://scholar.google.com/scholar?hl=en&lr=&cites=http%3A%2F%2Fdx.plos.org%2F10.1371%2Fjournal.pgen.0030050 ), the first result (based on DOI in the search term?) is a Genetics Soc America article.

However, if you pull in author and title within the Scholar search term, like Bio and Medicine, you seem to get the correct article at the top more often: http://medicine.plosjournals.org/perlserv/?request=get-document&doi=10.1371/journal.pmed.0040125 ( http://scholar.google.com/scholar?q=author:M+author:Nomura+Polymorphisms,+Mutations,+and+Amplification+of+the+EGFR+Gene+in+Non-Small+Cell+Lung+Cancers )

A similar search mechanism (author + title, not DOI) applied to the first article above only pulls up the relevant article ( http://scholar.google.com/scholar?q=author:M+author:Johnson-Schlitz+Multiple-Pathway+Analysis+of+Double-Strand+Break+Repair+Mutations+in+Drosophila ).

Is this something to consider changing (to bring up what appear to be more accurate results)?

Changed 17 months ago by npeterson

  • version changed from 0.9.1_rc2-SNAPSHOT to 0.9.2

Changed 17 months ago by josowski

We might switch to title/author:

 http://scholar.google.com/scholar?hl=en&lr=&q=%22Modulation+of+the+%CE%B2-Catenin+Signaling+Pathway+by+the+Dishevelled-Associated+Protein+Hipk1%22+author%3A%22Sarah+H.+Louie%22&btnG=Search

BUT we would have no direct link to citations or related articles at this point. That is, the user would have to go to Scholar via the above link and then click on citations or related. And this won't fix the issue with newly published articles.
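
A minimal sketch of how such a title/author link could be assembled, assuming the quoted-title plus `author:"..."` form shown in the URL above. The function name is hypothetical and this is not necessarily how the change was implemented in Ambra.

```python
from urllib.parse import urlencode


def title_author_url(title, author):
    """Build a Scholar search link from an exact-phrase title and a quoted author name."""
    # Quote the title so it is searched as a phrase, then restrict by author.
    query = '"{}" author:"{}"'.format(title, author)
    params = {"hl": "en", "lr": "", "q": query, "btnG": "Search"}
    # urlencode percent-encodes the quotes, the colon and non-ASCII characters (as UTF-8).
    return "http://scholar.google.com/scholar?" + urlencode(params)


if __name__ == "__main__":
    print(title_author_url(
        "Modulation of the \u03b2-Catenin Signaling Pathway by the "
        "Dishevelled-Associated Protein Hipk1",
        "Sarah H. Louie",
    ))
```

The result is an ordinary search results page, so the citations and related-articles links are one click further away, as noted above.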

The root of the issue here is that the Google search index is out of date. To my knowledge there is no way to force Google to crawl our sites more frequently; we just have to wait it out.

Changed 17 months ago by josowski

  • owner changed from dragisak to josowski

Changed 16 months ago by npeterson

  • version changed from 0.9.2 to 0.9.3
  • milestone changed from 0.9.3_rc1 to 0.9.3_rc2

Changed 16 months ago by josowski

  • status changed from new to verify
  • resolution set to fixed

I made this change in r7690

Changed 16 months ago by josowski

  • status changed from verify to reopened
  • resolution fixed deleted

Changed 16 months ago by josowski

  • owner changed from josowski to npeterson
  • status changed from reopened to verify
  • resolution set to fixed

Changed 16 months ago by josowski

  • status changed from verify to reopened
  • owner changed from npeterson to josowski
  • resolution fixed deleted

Sorry, got my Trac items confused...

Changed 15 months ago by josowski

  • owner changed from josowski to npeterson
  • status changed from reopened to verify
  • resolution set to fixed

We're banking on this being fixed by the new Google feed we're providing. It's going to take time to verify.

Changed 15 months ago by rcave

  • milestone changed from 0.9.3_rc2 to 0.9.3_rc3

Changed 13 months ago by npeterson

  • status changed from verify to closed

Some further refinements are needed to the Google Scholar feed in the next Sprint, but as far as I can tell this is working as it should be.
