hi
Thanks a lot for your help. I tested with the uploaded script and received this (I only changed the PATH-TO-ROOT info):

250 - LIN Server
256 - LIN Server 64bit
Handle: /usr/bin/pdftotext -cfg /PATH-TO-ROOT/html/components/com_joboffer/libraries/ifile/adapter/helpers/binaries/xpdfrc/xpdfrc "/PATH-TO-ROOT/html/components/com_joboffer/uploads/55a3c29d2ad8f/CV Mylne PANOTIER Ingnieure QSE.pdfx" - 2>&1
User: www-data
pdftotext version 0.18.4
Copyright 2005-2011 The Poppler Developers - poppler.freedesktop.org
Copyright 1996-2004 Glyph & Cog, LLC
Usage: pdftotext [options] []
  -f : first page to convert
  -l : last page to convert
  -r : resolution, in DPI (default is 72)
  -x : x-coordinate of the crop area top left corner
  -y : y-coordinate of the crop area top left corner
  -W : width of crop area in pixels (default is 0)
  -H : height of crop area in pixels (default is 0)
  -layout : maintain original physical layout
  -raw : keep strings in content stream order
  -htmlmeta : generate a simple HTML file, including the meta information
  -enc : output text encoding name
  -listenc : list available encodings
  -eol : output end-of-line convention (unix, dos, or mac)
  -nopgbrk : don't insert page breaks between pages
  -bbox : output bounding box for each word and page size to html. Sets -htmlmeta
  -opw : owner password (for encrypted files)
  -upw : user password (for encrypted files)
  -q : don't print any messages or errors
  -v : print copyright and version info
  -h : print usage information
  -help : print usage information
  --help : print usage information
  -? : print usage information
The administrator has disabled public write access.
Hi,
I think the problem is that the "www-data" user lacks permission to execute "/usr/bin/pdftotext". Try running this command from the command line (without "sudo"):

/usr/bin/pdftotext -cfg /PATH-TO-ROOT/html/components/com_joboffer/libraries/ifile/adapter/helpers/binaries/xpdfrc/xpdfrc "/PATH-TO-ROOT/html/components/com_joboffer/uploads/55a3c29d2ad8f/CV Mylne PANOTIER Ingnieure QSE.pdfx"

Could you also try with a document that has no blank in its name?
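A minimal sketch of the permission check suggested here, meant to be run under the same account the web server uses (www-data in the output above). The function name check_exec and the BIN variable are made up for this illustration; only the /usr/bin/pdftotext path comes from the thread.

```shell
#!/bin/sh
# Report whether the current user may execute a given binary.
# Exit status: 0 = executable, 1 = present but not executable, 2 = missing.
check_exec() {
    bin="$1"
    if [ ! -e "$bin" ]; then
        echo "missing: $bin"
        return 2
    elif [ ! -x "$bin" ]; then
        echo "not executable by $(id -un): $bin"
        return 1
    fi
    echo "executable by $(id -un): $bin"
}

# BIN is a hypothetical override; the default is the path quoted in the thread.
check_exec "${BIN:-/usr/bin/pdftotext}" || true
```

This only tests the execute permission of the binary itself; it says nothing about whether the options passed to it are accepted.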
If you like, if it was useful, consider a donation, Thanks
Help us by voting for our extensions on Joomla.org: JiFile, JoomPhoto Mobile, Easy Language
hi
I verified that Apache has permission to execute pdfinfo and pdftotext, and all seems to be OK. But when we tried to execute the command you sent, we noticed that pdftotext (and pdfinfo) have no '-cfg' option. Our version of XPDF is 3.0.3; pdfinfo and pdftotext are 0.18.4.
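One way to confirm this without reading the whole usage text is to search the help output for the option. The sketch below assumes that the help prints one option per line, with the option name at the start of the line; the function names and the BIN variable are invented for the example.

```shell
#!/bin/sh
# Return 0 if the help text read from stdin has a line starting with "-cfg".
help_lists_cfg() {
    grep -q -e '^[[:space:]]*-cfg[[:space:]]'
}

# Return 0 if the given pdftotext binary advertises "-cfg" in its help.
# "-h" is listed as "print usage information" in the output posted above.
supports_cfg() {
    "$1" -h 2>&1 | help_lists_cfg
}

# BIN is a hypothetical override; the default is the path quoted in the thread.
bin="${BIN:-/usr/bin/pdftotext}"
if [ -x "$bin" ] && supports_cfg "$bin"; then
    echo "$bin lists -cfg"
else
    echo "$bin does not list -cfg (or is not executable)"
fi
```

With the 0.18.4 build reported above, which prints a Poppler copyright line and no -cfg entry, this should take the second branch.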
Hi
I just wanted to test what happens if I change the directive for pdfinfo in your adapter file Search_Lucene_Document_PDF.php; I simply removed the "-cfg" part:

if ($custom_pdftotext) {
    //$handle2 = popen($pathBinaryFile . " {$opw} -cfg {$configXpdf} -q \"{$this->getFilename()}\" {$outputStreming}", 'r');
    $handle2 = popen($pathBinaryFile . " {$opw} -q \"{$this->getFilename()}\" {$outputStreming}", 'r');
} else {
    //$handle2 = popen($pathBinaryFile . "{$executableSO} {$opw} -cfg {$configXpdf} -q \"{$this->getFilename()}\" {$outputStreming}", 'r');
    $handle2 = popen($pathBinaryFile . "{$executableSO} {$opw} -q \"{$this->getFilename()}\" {$outputStreming}", 'r');
}

And all the PDFs are indexed! Did we find the solution or not?
Last Edit: 06 Oct 2015 17:50 by tesson.
Yes,
this solution works, but you can no longer use the configuration file; you need to configure XPDF from the command line instead.
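As a rough illustration of configuring from the command line: the usage text posted earlier in this thread lists -enc, -eol and -q, so settings that would otherwise live in xpdfrc can be passed as options. The function name extract_text and the PDFTOTEXT variable are invented for this sketch, and UTF-8 as the encoding name is an assumption; -listenc shows what your build accepts.

```shell
#!/bin/sh
# Run pdftotext with encoding and end-of-line set via options instead of xpdfrc.
# The trailing "-" sends the text to standard output, as in the Handle line above.
extract_text() {
    pdf="$1"
    # PDFTOTEXT is a hypothetical override for the binary location.
    bin="${PDFTOTEXT:-/usr/bin/pdftotext}"
    # Quoting "$pdf" keeps a file name containing blanks as one argument.
    "$bin" -enc UTF-8 -eol unix -q "$pdf" -
}
```

Usage would look like `extract_text "/path/to/some file.pdf" > out.txt`, where the path is a placeholder.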
I updated to Joomla 3.6.2. At first I thought the problem only affected protected or PDF/A files, but then I renamed one of the existing, already indexed PDFs, deleted its existing index entry, and tried to reindex it: empty body.
Nothing has changed and everything is green (apart from id3 and com_dotnet, which I don't use), but I can't index newer files.
OK, working again, after uninstalling and reinstalling (I can't download with Google Chrome while logged in). The paths in the manual are wrong. I think that with the Joomla update the paths changed from libraries/ifile/... to administrator/components/com_jifile/libraries/ifile/... But why were pdftotext and pdfinfo shown as green when they could not be found?
I then set utf-8 instead of a blank value in the config and removed the # in xpdfrc.