<dependency>
<groupId>com.groupdocs</groupId>
<artifactId>groupdocs-parser-cloud</artifactId>
<version>22.3</version>
</dependency>
compile(group: 'com.groupdocs', name: 'groupdocs-parser-cloud', version: '22.3')
<dependency org="com.groupdocs" name="groupdocs-parser-cloud" rev="22.3">
<artifact name="groupdocs-parser-cloud" ext="jar"/>
</dependency>
libraryDependencies += "com.groupdocs" % "groupdocs-parser-cloud" % "22.3"
Document Parser Java Cloud REST API
Product Page | Docs | Live Demos | Swagger UI | Code Samples | Blog | Free Support | Free Trial
GroupDocs.Parser Cloud SDK for Java helps you build cloud document-parsing Java apps that work without installing any third-party software. It is a wrapper around the GroupDocs.Parser Cloud REST APIs.
Cloud Document Parsing SDK Features
- Create user-defined data extraction templates to extract data from the cloud documents.
- Retrieve user-defined templates created for parsing cloud data.
- Extract text from cloud-hosted files in several ways:
  - Extract text in simple form.
  - Extract text while keeping the formatting intact.
  - Extract text from specific pages only by providing a page range.
- Extract images from files hosted on the cloud:
  - Extract all images from the whole cloud document.
  - Extract images from specific pages based on a desired page range.
- Get a list of all supported file formats.
- Fetch useful information about a cloud document, such as:
  - File extension
  - Size in bytes
  - Page count
- Retrieve information about the items within a container, such as a ZIP archive or PDF portfolio.
- Built-in cloud storage API to work with files & folders on the cloud storage.
Supported Document Parsing File Formats
Microsoft Word®: DOC, DOT, DOCX, DOCM, DOTX, DOTM, TXT, RTF
OpenOffice Writer®: ODT, OTT
Microsoft Excel®: XLS, XLT, XLSX, XLSM, XLSB, XLTX, XLTM, CSV, XLA, XLAM
OpenOffice Calc®: ODS, OTS
Apple® iWork: NUMBERS
Microsoft PowerPoint®: PPT, PPS, POT, PPTX, PPTM, POTX, POTM, PPSX, PPSM
OpenOffice Impress®: ODP, OTP
Microsoft Outlook®: PST, OST, EML, MSG
Apple® Mail: EMLX
Microsoft OneNote®: ONE
Markup: HTML, XHTML, MHTML, MD (Markdown), XML
eBooks: CHM, EPUB, FB2
Fixed Layout: PDF
Archives: ZIP
Requirements
Building the API client library requires:
- Java 1.7+
- Maven
Prerequisites
To use GroupDocs.Parser Cloud SDK for Java, you need to register an account with GroupDocs Cloud and look up or create your Client ID and Client Secret in the Cloud Dashboard. A free quota is available. For more details, see GroupDocs Cloud Pricing.
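Because the Client ID and Client Secret are account credentials, one common approach is to read them from the environment rather than hard-coding them in source. The following is only a minimal sketch: the `Credentials` class and the variable names `GROUPDOCS_CLIENT_ID` and `GROUPDOCS_CLIENT_SECRET` are assumptions made for illustration and are not defined by the SDK.

```java
import java.util.Map;

// Hypothetical helper for illustration; not part of the GroupDocs SDK.
final class Credentials {
    final String clientId;
    final String clientSecret;

    private Credentials(String clientId, String clientSecret) {
        this.clientId = clientId;
        this.clientSecret = clientSecret;
    }

    // Reads both values from the given environment map and fails fast
    // if either one is missing or empty. The variable names are assumed.
    static Credentials fromEnv(Map<String, String> env) {
        String id = env.get("GROUPDOCS_CLIENT_ID");
        String secret = env.get("GROUPDOCS_CLIENT_SECRET");
        if (id == null || id.isEmpty() || secret == null || secret.isEmpty()) {
            throw new IllegalStateException(
                    "Set GROUPDOCS_CLIENT_ID and GROUPDOCS_CLIENT_SECRET");
        }
        return new Credentials(id, secret);
    }

    public static void main(String[] args) {
        try {
            Credentials credentials = fromEnv(System.getenv());
            System.out.println("Loaded credentials for client " + credentials.clientId);
        } catch (IllegalStateException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

Taking the environment as a `Map` keeps the lookup easy to exercise with test values; in an application you would pass `System.getenv()` as shown in `main`.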
Install GroupDocs.Parser-Cloud from Maven
Add the GroupDocs Cloud repository to your application's pom.xml:
<repository>
<id>repository.groupdocs.cloud</id>
<name>repository.groupdocs.cloud</name>
<url>https://repository.groupdocs.cloud/repo/</url>
</repository>
Install from source
To install the API client library to your local Maven repository, simply execute:
mvn clean install
To deploy it to a remote Maven repository instead, configure the settings of the repository and execute:
mvn clean deploy
Refer to the OSSRH Guide for more information.
Maven users
Add this dependency to your project’s POM:
<dependency>
<groupId>com.groupdocs</groupId>
<artifactId>groupdocs-parser-cloud</artifactId>
<version>22.3</version>
</dependency>
Others
First, generate the JAR by executing:
mvn clean package
Then manually install the following JARs:
target/groupdocs-parser-cloud-22.3.jar
target/lib/*.jar
Get Started
Please follow the Quick Start instructions.
Extract Text by a Page Number Range via Java Cloud SDK
// For complete examples and data files, please go to https://github.com/groupdocs-parser-cloud/groupdocs-parser-cloud-java-samples
String MyAppKey = ""; // Get AppKey and AppSID from https://dashboard.groupdocs.cloud
String MyAppSid = ""; // Get AppKey and AppSID from https://dashboard.groupdocs.cloud
Configuration configuration = new Configuration(MyAppSid, MyAppKey);
ParseApi apiInstance = new ParseApi(configuration);
FileInfo fileInfo = new FileInfo();
fileInfo.setFilePath("pdf/four-pages.pdf");
TextOptions options = new TextOptions();
options.setStartPageNumber(1);
options.setCountPagesToExtract(1);
options.setFileInfo(fileInfo);
TextRequest request = new TextRequest(options);
TextResult response = apiInstance.text(request);
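The sample selects pages with a start page plus a page count rather than a first/last pair. If your input is an inclusive page range, the conversion is a small piece of arithmetic, sketched below. `PageRange` is a hypothetical helper, not an SDK type. Whether `startPageNumber` is zero-based or one-based is not stated here, so the helper simply keeps whatever base the caller uses.

```java
// Hypothetical helper for illustration; not part of the GroupDocs SDK.
final class PageRange {
    final int startPageNumber;
    final int countPagesToExtract;

    private PageRange(int startPageNumber, int countPagesToExtract) {
        this.startPageNumber = startPageNumber;
        this.countPagesToExtract = countPagesToExtract;
    }

    // Converts an inclusive [first, last] range into a start page and a count.
    static PageRange ofInclusive(int first, int last) {
        if (first < 0 || last < first) {
            throw new IllegalArgumentException(
                    "invalid page range: " + first + ".." + last);
        }
        return new PageRange(first, last - first + 1);
    }

    public static void main(String[] args) {
        PageRange range = PageRange.ofInclusive(1, 3);
        System.out.println("start=" + range.startPageNumber
                + " count=" + range.countPagesToExtract);
    }
}
```

The two fields would then be passed to `options.setStartPageNumber(...)` and `options.setCountPagesToExtract(...)` as in the sample above.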
Authorization & Authentication
The authentication scheme defined for the API is as follows:
JWT
- Type: OAuth 2.0
- Flow: application
- Authorization URL: https://api.groupdocs.cloud/connect/token
- Token Lifetime: 1 day (Default)
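The "application" flow corresponds to the OAuth 2.0 client-credentials grant, in which the client posts its credentials to the authorization URL as a form-encoded body and receives a token in return. The SDK sample only passes the credentials to `Configuration`, so this exchange presumably happens inside the client. For readers calling the REST API directly, here is a minimal sketch of the request body. It assumes the standard RFC 6749 parameter names (`grant_type`, `client_id`, `client_secret`); the `TokenRequestSketch` class is hypothetical, and the sketch only builds the body without sending it.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper for illustration; not part of the GroupDocs SDK.
final class TokenRequestSketch {

    // Authorization URL as listed in this section.
    static final String TOKEN_URL = "https://api.groupdocs.cloud/connect/token";

    // Encodes one value for an application/x-www-form-urlencoded body.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every JVM, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    // Builds the form body for a client-credentials token request.
    // Parameter names follow RFC 6749 (assumed to apply to this API).
    static String buildFormBody(String clientId, String clientSecret) {
        return "grant_type=client_credentials"
                + "&client_id=" + encode(clientId)
                + "&client_secret=" + encode(clientSecret);
    }

    public static void main(String[] args) {
        // Placeholder credentials; real values come from the Cloud Dashboard.
        String body = buildFormBody("my-client-id", "my secret/value");
        System.out.println("POST " + TOKEN_URL);
        System.out.println("Content-Type: application/x-www-form-urlencoded");
        System.out.println(body);
    }
}
```

Encoding each value matters because secrets may contain characters such as `/`, `+`, or spaces that would otherwise corrupt the form body.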
File | Classifier | Size |
---|---|---|
groupdocs-parser-cloud-22.3-javadoc.jar | javadoc | 1.01 MB |
groupdocs-parser-cloud-22.3-sources.jar | sources | 177.61 KB |
groupdocs-parser-cloud-22.3.jar | | 258.95 KB |
groupdocs-parser-cloud-22.3.pom | | 2.85 KB |