Ticket #1269 (closed defect: fixed)

Opened 11 months ago

Last modified 9 months ago

publishing articles is slow

Reported by: russ Owned by: npeterson
Priority: critical Milestone: 0.9.3_rc2
Component: ambra Version: 0.9.2
Keywords: Cc:

Description

publishArchives.action became slow in version 0.9.1 (or maybe 0.9.0?)

up until now we've lumped this problem in with the site slowness during ingest. but i think it's different.

publish should be very quick - we're just flipping a bit and invalidating some cache entries.

and when we publish, we get confirmation in ambra.log very quickly for each article, followed by a loooooong delay before the action completes.

sometimes, the action times out and articles are left in a semi-published state. this usually resolves on a retry.

more info on the mailing list:  http://lists.topazproject.org/pipermail/ambra-dev/2009-April/001714.html

Change History

  Changed 11 months ago by npeterson

  • priority changed from high to critical
  • owner changed from npeterson to wtoconnor

  Changed 11 months ago by npeterson

pubbing today's CB articles gave me the following:

Errors

Exception: org.topazproject.ambra.ApplicationException?: org.topazproject.ambra.ApplicationException?: org.topazproject.otm.query.QueryException?: error performing query 'select id from Article o where id := cast(o, Article).id and id = ;'

Messages

Published: info:doi/10.1371/image.pcbi.v05.i04 Published: info:doi/10.1371/journal.pcbi.1000358 Published: info:doi/10.1371/journal.pcbi.1000359 Published: info:doi/10.1371/journal.pcbi.1000362 Published: info:doi/10.1371/journal.pcbi.1000365 Published: info:doi/10.1371/journal.pcbi.1000367 Published: info:doi/10.1371/journal.pcbi.1000371 Published: info:doi/10.1371/journal.pcbi.1000374 Published: info:doi/10.1371/journal.pcbi.1000375

it tells me everything pubbed, but if i go to browse by pub date there is nothing for 4/24, and search turns nothing up. Going directly to an article gives a site error (though I can view it if I log in).

Same for Gen: " Exception: org.topazproject.ambra.ApplicationException?: org.topazproject.ambra.ApplicationException?: org.topazproject.otm.query.QueryException?: error performing query 'select id from Article o where id := cast(o, Article).id and id = ;'

Messages

Published: info:doi/10.1371/image.pgen.v05.i04 Published: info:doi/10.1371/journal.pgen.1000458 Published: info:doi/10.1371/journal.pgen.1000459 Published: info:doi/10.1371/journal.pgen.1000460 Published: info:doi/10.1371/journal.pgen.1000461 "

possibly relevant: the pub step is taking longer than usual today. 12 minutes to pub 5 Genetics articles.

  Changed 11 months ago by wtoconnor

Looks like the transaction timed out.

*************************************************************************************** Caused by: org.mulgara.query.MulgaraTransactionException?: No externally mediated transaction associated with session - last transaction was rolled back with error: transaction-timeout

at

**************************************************************************************

2009-04-23 15:04:32,011 ERROR BaseAdminActionSupport?(PLoSCompBiol)> Admin console Exception [TP-Processor1176 org.topazproject.ambra.admin.action.BaseAdminActionSupport?] org.topazproject.ambra.ApplicationException?: org.topazproject.ambra.ApplicationException?: org.topazproject.otm.query.QueryException?: error performing query 'select id from Article o where id := cast(o, Article).id and id = <info:doi/10.1371/journal.pcbi.1000365>;' Caused by: org.topazproject.ambra.ApplicationException?: org.topazproject.otm.query.QueryException?: error performing query 'select id from Article o where id:= cast(o, Article).id and id = <info:doi/10.1371/journal.pcbi.1000365>;' Caused by: org.topazproject.otm.query.QueryException?: error performing query 'select id from Article o where id := cast(o, Article).id and id = <info:doi/10.1371/journal.pcbi.1000365>;' Caused by: java.io.IOException: Error running query 'select $o from < local:///topazproject#filter:graph=ri> where $o <rdf:type> < http://rdf.topazproject.org/RDF/Article> in < local:///topazproject#filter:graph=ri> and $o <mulgara:is> <info:doi/10.1371/journal.pcbi.1000365> and $o < http://prismstandard.org/namespaces/1.2/basic/eIssn> '1935-2735' in < local:///topazproject#filter:graph=ri> and $o < http://prismstandard.org/namespaces/1.2/basic/eIssn> '1935-2735' in < local:///topazproject#filter:graph=ri>;' Caused by: org.mulgara.query.QueryException?: Query failed: No externally mediated transaction associated with session - last transaction was rolled back with error: transaction-timeout

full trace follows:

2009-04-23 15:04:32,011 ERROR BaseAdminActionSupport?(PLoSCompBiol)> Admin console Exception [TP-Processor1176 org.topazproject.ambra.admin.action.BaseAdminActionSupport?] org.topazproject.ambra.ApplicationException?: org.topazproject.ambra.ApplicationException?: org.topazproject.otm.query.QueryException?: error performing query 'select id from Article o where id := cast(o, Article).id and id = <info:doi/10.1371/journal.pcbi.1000365>;'

at

org.topazproject.ambra.admin.service.DocumentManagementService?.getPublis hableFiles(DocumentManagementService?.java:352)

at

org.topazproject.ambra.admin.service.DocumentManagementService?$$FastClas? sByCGLIB$$bf95593d.invoke(<generated>)

at net.sf.cglib.proxy.MethodProxy?.invoke(MethodProxy?.java:191)

org.topazproject.ambra.admin.service.DocumentManagementService?$$Enhancer ByCGLIB$$91a173cb.getPublishableFiles(<generated>)

at

org.topazproject.ambra.admin.action.BaseAdminActionSupport?.base(BaseAdmi? nActionSupport.java:46)

at

org.topazproject.ambra.admin.action.PublishArchivesAction?.execute(Publis hArchivesAction.java:69)

at

org.topazproject.ambra.admin.action.PublishArchivesAction?$$FastClassByCG LIB$$cc8b2cff.invoke(<generated>)

at net.sf.cglib.proxy.MethodProxy?.invoke(MethodProxy?.java:191)

org.topazproject.ambra.admin.action.PublishArchivesAction?$$EnhancerByCGL IB$$4d7b17e5.execute(<generated>)

at sun.reflect.NativeMethodAccessorImpl?.invoke0(Native Method)

org.topazproject.ambra.user.EnsureRoleInterceptor?.intercept(EnsureRoleIn? terceptor.java:43)

org.topazproject.ambra.user.EnsureUserAccountInterceptor?.getReturnCodeDe pendingOnDisplayName(EnsureUserAccountInterceptor?.java:101)

at

org.topazproject.ambra.user.EnsureUserAccountInterceptor?.intercept(Ensur eUserAccountInterceptor.java:73)

org.topazproject.ambra.user.UserAccountsInterceptor?.intercept(UserAccoun? tsInterceptor.java:102)

org.topazproject.ambra.journal.SessionFiltersInterceptor?.intercept(Sessi onFiltersInterceptor.java:51)

org.topazproject.ambra.web.DummySSOFilter.doFilter(DummySSOFilter.java:1 05)

Caused by: org.topazproject.ambra.ApplicationException?: org.topazproject.otm.query.QueryException?: error performing query 'select id from Article o where id := cast(o, Article).id and id = <info:doi/10.1371/journal.pcbi.1000365>;'

at

org.topazproject.ambra.article.service.FetchArticleService?.getArticleIds (FetchArticleService?.java:235)

at

org.topazproject.ambra.admin.service.DocumentManagementService?.getPublis hableFiles(DocumentManagementService?.java:347)

... 168 more

Caused by: org.topazproject.otm.query.QueryException?: error performing query 'select id from Article o where id := cast(o, Article).id and id = <info:doi/10.1371/journal.pcbi.1000365>;'

at

org.topazproject.otm.stores.ItqlStore?.doQuery(ItqlStore?.java:851)

at

org.topazproject.otm.impl.QueryImpl?.execute(QueryImpl?.java:86)

at

org.topazproject.ambra.journal.JournalCarrierService?.getObjects(JournalC arrierService.java:169)

at

org.topazproject.ambra.journal.JournalCarrierService?.buildCarrierMap(Jou rnalCarrierService.java:147)

at

org.topazproject.ambra.journal.JournalCarrierService?.access$200(JournalC arrierService.java:56)

at

org.topazproject.ambra.journal.JournalCarrierService?$2.lookup(JournalCar? rierService.java:187)

at

org.topazproject.ambra.journal.JournalCarrierService?$2.lookup(JournalCar? rierService.java:186)

at

org.topazproject.ambra.cache.EhcacheProvider?$1.execute(EhcacheProvider?.j ava:123)

at

org.topazproject.ambra.cache.Cache$SynchronizedLookup?.execute(Cache.java :220)

at

org.topazproject.ambra.cache.EhcacheProvider?.get(EhcacheProvider?.java:11 3)

at

org.topazproject.ambra.journal.JournalCarrierService?.getJournalKeysForOb ject(JournalCarrierService?.java:184)

at

org.topazproject.ambra.journal.JournalCarrierService?.getJournalsForObjec t(JournalCarrierService?.java:198)

at

org.topazproject.ambra.journal.JournalService?.getJournalsForObject(Journ alService.java:163)

at

org.topazproject.ambra.journal.JournalService?$$FastClassByCGLIB$$3e152a5 7.invoke(<generated>)

at net.sf.cglib.proxy.MethodProxy?.invoke(MethodProxy?.java:191)

org.topazproject.ambra.journal.JournalService?$$EnhancerByCGLIB$$dc36c2ad .getJournalsForObject(<generated>)

at

org.topazproject.ambra.article.service.ArticleFeedService?$Invalidator.ma tchesJournal(ArticleFeedService?.java:638)

at

org.topazproject.ambra.article.service.ArticleFeedService?$Invalidator.ma tches(ArticleFeedService?.java:532)

at

org.topazproject.ambra.article.service.ArticleFeedService?$Invalidator.in validateFeedCacheForArticle(ArticleFeedService?.java:478)

at

org.topazproject.ambra.article.service.ArticleFeedService?$Invalidator.ob jectChanged(ArticleFeedService?.java:425)

at

org.topazproject.ambra.cache.CacheManager?.objectChanged(CacheManager?.jav a:163)

at

org.topazproject.ambra.cache.OtmInterceptor?.onPostWrite(OtmInterceptor?.j ava:232)

at

org.topazproject.otm.impl.SessionImpl?.write(SessionImpl?.java:584)

at

org.topazproject.otm.impl.SessionImpl?.doFlush(SessionImpl?.java:167)

at

org.topazproject.otm.impl.SessionImpl?.flush(SessionImpl?.java:131)

at

org.topazproject.otm.impl.QueryImpl?.execute(QueryImpl?.java:81)

at

org.topazproject.ambra.article.service.ArticleOtmService?.getArticleIds(A rticleOtmService.java:246)

at

org.topazproject.ambra.article.service.ArticleOtmService?$$FastClassByCGL IB$$cdd3990.invoke(<generated>)

at net.sf.cglib.proxy.MethodProxy?.invoke(MethodProxy?.java:191)

org.topazproject.ambra.article.service.ArticleOtmService?$$EnhancerByCGLI B$$8eb5fd20.getArticleIds(<generated>)

at

org.topazproject.ambra.article.service.FetchArticleService?.getArticleIds (FetchArticleService?.java:233)

... 169 more

Caused by: java.io.IOException: Error running query 'select $o from < local:///topazproject#filter:graph=ri> where $o <rdf:type> < http://rdf.topazproject.org/RDF/Article> in < local:///topazproject#filter:graph=ri> and $o <mulgara:is> <info:doi/10.1371/journal.pcbi.1000365> and $o < http://prismstandard.org/namespaces/1.2/basic/eIssn> '1935-2735' in < local:///topazproject#filter:graph=ri> and $o < http://prismstandard.org/namespaces/1.2/basic/eIssn> '1935-2735' in < local:///topazproject#filter:graph=ri>;'

at

org.topazproject.mulgara.itql.TIClient.doQuery(TIClient.java:101)

at

org.topazproject.otm.stores.ItqlStore?.doQuery(ItqlStore?.java:849)

... 209 more

Caused by: org.mulgara.query.QueryException?: Query failed: No externally mediated transaction associated with session - last transaction was rolled back with error: transaction-timeout

at

org.mulgara.resolver.DatabaseSession?.execute(DatabaseSession?.java:755)

at

org.mulgara.resolver.DatabaseSession?.query(DatabaseSession?.java:465)

at

org.mulgara.server.rmi.SessionWrapperRemoteSession?.query(SessionWrapperR emoteSession.java:396)

at sun.reflect.GeneratedMethodAccessor?9.invoke(Unknown Source)

org.mulgara.server.rmi.RemoteSessionWrapperSession?.query(RemoteSessionWr? apperSession.java:541)

at org.mulgara.query.Query.execute(Query.java:568) at org.mulgara.query.Query.execute(Query.java:57) at

org.topazproject.mulgara.itql.TIClient.doQuery(TIClient.java:82)

... 210 more

*************************************************************************************** Caused by: org.mulgara.query.MulgaraTransactionException?: No externally mediated transaction associated with session - last transaction was rolled back with error: transaction-timeout

at

************************************************************************************** org.mulgara.resolver.MulgaraExternalTransactionFactory?.getTransaction(Mu lgaraExternalTransactionFactory.java:66)

at

org.mulgara.resolver.DatabaseSession?.execute(DatabaseSession?.java:751)

at

org.mulgara.resolver.DatabaseSession?.query(DatabaseSession?.java:465)

at

org.mulgara.server.rmi.SessionWrapperRemoteSession?.query(SessionWrapperR emoteSession.java:396)

at sun.reflect.GeneratedMethodAccessor?9.invoke(Unknown Source)

org.topazproject.ambra.admin.action.PublishArchivesAction?$$EnhancerByCGL IB$$4d7b17e5.execute(<generated>)

at sun.reflect.NativeMethodAccessorImpl?.invoke0(Native Method)

org.topazproject.ambra.user.EnsureUserAccountInterceptor?.getReturnCodeDe pendingOnDisplayName(EnsureUserAccountInterceptor?.java:101)

org.topazproject.ambra.user.EnsureUserAccountInterceptor?.intercept(Ensur eUserAccountInterceptor.java:73)

org.topazproject.ambra.user.UserAccountsInterceptor?.intercept(UserAccoun? tsInterceptor.java:102)

org.topazproject.ambra.journal.SessionFiltersInterceptor?.intercept(Sessi onFiltersInterceptor.java:51)

org.topazproject.ambra.cas.client.filter.CASFilter.doFilter(CASFilter.ja va:319)

org.topazproject.ambra.web.DummySSOFilter.doFilter(DummySSOFilter.java:1 05)

org.topazproject.ambra.web.MultipleRequestFilter?.doFilter(MultipleReques? tFilter.java:95)

follow-up: ↓ 5   Changed 11 months ago by rcave

Possibly due to Lucene re-index for newly published articles. Can Lucene re-index asynchronously?

in reply to: ↑ 4   Changed 11 months ago by ronald

Replying to rich:

Possibly due to Lucene re-index for newly published articles. Can Lucene re-index asynchronously?

First of all, stop guessing - it's quite easy to figure out where the time is being spent. Second, there's no indexing happening on publish.

  Changed 10 months ago by wtoconnor

After looking through the logs and doing more research into the incredibly complex piece of code that simply needs to find articles in the DISABLED state (not yet published).

1. By the time this error happens the articles are already published and everything 
is fine.

2. The PublishArchievesAction has a method called base(). One of base()'s jobs is 
to look up all the ready to publish articles and provide that information to the 
freemarker template so the admin will see them on the page.

3. base() invokes getPublishableFiles which looks for all the articles with  state
= Article.STATE_DISABLED. (seems simple enough)

4. During this lookup QueryImpl notices we have just written things (we published) 
and decides to flush.

5. After the write the onPostWrite method of the OtmIntercepter is invoked which 
means all the cache invalidators (note the plural) start getting called. This sets 
off a new round of queries by JournalCarrierService.getJournalKeysForObject.
  
private Set<URI> getObjects(String jName, URI obj, Session s) {
    Set<String> oldFilters = s.listFilters();
    for (String fn : oldFilters)
      s.disableFilter(fn);

    for (String fn : journalFilterService.getFilters(jName, s))
      s.enableFilter(fn);

    Set<URI> res = new HashSet<URI>();

    // XXX: should be "... from Object ..." but filters don't get applied then
    String q = "select id from Article o where id := cast(o, Article).id" +
                (obj != null ? " and id = <" + obj + ">;" : ";");

Which is part of the journal filtering.

6. Eventually one of these queries hangs for some reason and the transaction 
times out and everything gets rolled back even though it was probably published 
correctly in the first place.

7. In the case of this exception the Feed invalidator was getting called. I am 
not sure if it is always the one that causes the problem but we need to check.

Basically this one one action spawns hundreds of queries.


  Changed 10 months ago by npeterson

  • milestone changed from 0.9.3_rc1 to 0.9.3_rc2

  Changed 10 months ago by npeterson

  • owner changed from wtoconnor to dragisak
  • status changed from new to assigned

  Changed 9 months ago by npeterson

  • owner changed from dragisak to npeterson
  • status changed from assigned to verify
  • resolution set to fixed

  Changed 9 months ago by npeterson

  • status changed from verify to closed
Note: See TracTickets for help on using tickets.