Keyword & keyphrase extraction service to release value from unstructured text-based data

Gain rapid insight into dark and unstructured data on disk. Search the results for nuggets of useful information and illuminate the documents crucial to project success, with minimal effort and no wasted time.


Automated Keyword & Keyphrase Extractor

Robust, intelligent and non-destructive service to extract keywords and keyphrases from collections of unstructured and dark data.


Batch Mode

Processes complete directory trees (including .zip archives) without disturbing file integrity.
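As a rough illustration of non-destructive batch traversal, the sketch below walks a directory tree and lists every file, including the members of .zip archives, without extracting or changing anything on disk. It is a minimal sketch and not Keywise's own implementation: the function name `list_files` and the `archive.zip!member` naming convention are assumptions made purely for illustration.

```python
import os
import zipfile


def list_files(root):
    """Yield a path for every file under root, looking inside .zip archives.

    Archives are opened read-only, so nothing on disk is modified.
    Zip members are reported as '<archive path>!<member name>'
    (a hypothetical convention chosen for this sketch).
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            yield path
            if not zipfile.is_zipfile(path):
                continue
            try:
                with zipfile.ZipFile(path, "r") as archive:
                    for member in archive.namelist():
                        if not member.endswith("/"):  # skip directory entries
                            yield f"{path}!{member}"
            except zipfile.BadZipFile:
                # Unreadable archive: report the file itself, skip its contents.
                continue
```

Calling `list(list_files("some_folder"))` would return the ordinary files plus one entry per archived file, leaving the source tree untouched.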


Run multiple algorithms sequentially

Powered by a choice of multiple keywording algorithms. Extensible to include additional methods as needed.


No pre-conditioning: will work on a 'bag of files'

Great for rapidly retrieving meaningful information from a 'bag of files' where file location or existence is unknown and file naming cannot be relied upon.


Returns simple text and .csv data as results

Output is configurable and uses the simplest of formats, .csv and .txt, which are universally accepted for import.
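To show why .csv is such a convenient result format, here is a minimal sketch that writes one row per extracted keyword. The column layout (`file`, `method`, `keyword`) and the function name `write_keywords_csv` are hypothetical; the actual Keywise output layout is not described here.

```python
import csv
import io


def write_keywords_csv(results, stream):
    """Write (file name, method, keywords) results as one CSV row per keyword.

    The three-column layout is an assumption made for illustration only.
    """
    writer = csv.writer(stream)
    writer.writerow(["file", "method", "keyword"])
    for file_name, method, keywords in results:
        for keyword in keywords:
            writer.writerow([file_name, method, keyword])


# Example: write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
write_keywords_csv([("895786.pdf", "Method 1", ["breccias", "cardenas field"])], buffer)
```

A file written this way opens directly in a spreadsheet or loads into a database without any conversion step.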


Better than Windows Search (or similar)?

Windows Search is blind to the importance of a word or term within a document. In contrast, Keywise returns terms that its algorithms rank as important to the document, giving the user a more useful and meaningful output.
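The difference between finding a word and ranking its importance can be sketched with TF-IDF, a widely used scoring technique. This is an assumption chosen for illustration: the specific algorithms used by Keywise are not named here, and the function names below are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_keywords(documents, doc_index, top_n=3):
    """Rank the words of one document by TF-IDF against the whole collection."""
    tokenized = [tokenize(doc) for doc in documents]
    # Document frequency: in how many documents does each word appear?
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))
    counts = Counter(tokenized[doc_index])
    total = sum(counts.values())
    if total == 0:
        return []
    scores = {
        word: (count / total) * math.log(len(documents) / doc_freq[word])
        for word, count in counts.items()
    }
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_n]


docs = [
    "the impact breccias form the reservoir and the breccias hold oil",
    "the seismic survey covers the field",
    "the well report lists the casing depths",
]
top_term = tfidf_keywords(docs, 0, top_n=1)
```

In the first document "the" is the most frequent word, yet it scores zero because it occurs in every document, while "breccias" ranks first. A plain text search would treat both words the same way.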


Creates file listings

Capture the location and presence of specific file types in a ‘bag of files’. Identify empty or low value files and corrupt .zip archives.
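A file listing of this kind could be sketched as follows. This is a minimal illustration under stated assumptions, not the Keywise implementation: the function name `audit_files` and the status labels are hypothetical, and "low value" is reduced here to the simplest case of a zero-byte file.

```python
import os
import zipfile


def audit_files(root):
    """Return (path, status) pairs; status is 'empty', 'corrupt zip' or 'ok'."""
    report = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            status = "ok"
            if os.path.getsize(path) == 0:
                status = "empty"
            elif name.lower().endswith(".zip"):
                try:
                    with zipfile.ZipFile(path, "r") as archive:
                        # testzip() returns the first member that fails
                        # its integrity check, or None if all pass.
                        if archive.testzip() is not None:
                            status = "corrupt zip"
                except zipfile.BadZipFile:
                    status = "corrupt zip"
            report.append((path, status))
    return report
```

The returned pairs can be filtered by extension or status to produce a listing of specific file types, empty files, or damaged archives.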


Identifies all .pdf files with non-extractable text

These files are potential candidates for Optical Character Recognition (OCR) and are listed in a .csv output.


Simple and pragmatic

Keywise delivers a much better cost/value ratio for the client than more labour-intensive, full metadata indexing. Entire catalogues of files can be processed in a single operation.

Keywise Illustrative Example

Here is an example of the results from applying Keywise to a public domain technical publication. The aim is to see if the Keywise outputs help clarify what information is included in that file. The document in question, in pdf format, has the file name "895786.pdf", which, everyone would agree, reveals little regarding the content of the file.

The table below shows results for file 895786.pdf (real*) using three of the numerous extraction methods available in Keywise. Useful terms are in green and comprise 68-92% of returned results. Who would have known that an unimaginatively named file called 895786.pdf buried ten directories deep in an obscure folder (or even a zip archive) even existed, and that it reveals that the Chicxulub impact event created limestone impact breccias that can act as hydrocarbon reservoirs? Just what you might need for your next project.

Keywise… insight delivered!

Method 1 | Method 2 | Method 3
76+% useful returns | 92% useful returns | 68-74% useful returns, high uniqueness
breccias | chicxulub impact breccias | analytical modeling
cardenas field | couette general flow | balance mass equations
caribbean petroleum engineering | debris flow sediment | breccias
chicxulub impact | debris flow sediments | clasts
clasts | different flow | different analytic models
couette flow | discrete fracture network flow system | discontinuity
debris flow | dynamic flow process | eighth largest oil
discontinuities | equation flow | fault breccia
embedded clasts | flow analytical model | fluid flow
engineering | flow behavior | fluid velocity
equation | flow field | fractures
fault | flow velocities | grains behavior throughmomentum
fault breccias | flow velocity distribution | hagen poiseuille tube
field | fluid flow | impact breccias
fluid flow | fluid flow characteristics | impact crater chicxulub
fluid flow field | fluid flow field | limestone reservoirs
fluid velocity | fluid flow theory corresponds | low hydraulic permeabil-
fractures | fracture flow problem | many impact breccias
geological | high flow values | mexico
impact breccias | high velocity flow | nonplanar discontinuities
impact fluid flow | high volumetric flow | numerical groundwater flow
limestone | huge flow area | petroleum engineering
limestone reservoirs | impact breccia clasts | planar discontinuities
mexico | impact breccia flow | poza rica field
oil | impact breccia oil | reservoir oil flow
oil flow | impact breccias clasts | rocks hydraulic permeability
permeability | impact breccias contain | sedimentary breccias
petroleum | impact melt breccias | specific analytical models
petroleum engineering | limestone reservoirs show fault breccias | tectonic fracture networks
petroleum engineering conference | low flow velocity | tectonic fractures
porosity | many impact breccias | tectonic fractures concepts
reservoir | mean flow velocity | vuggy porous medium
sedimentary | newtonian fluid flow | vugs
sedimentary breccias | numerical groundwater flow model | world oil reservoirs
tectonic | oil flow | 𝑟
tectonic fractures | other impact breccias | 𝑢
velocity | parallel flow lines | 𝜃
vugs | present low fluid velocity |
figure | reservoir oil flow | creative commons attribution
flow | section flow area | eje central lázaro
fluid | sedimentary breccias store fluid | facultad de ingenieŕıa
hindawi publishing | sink flow | galvis et al
hindawi publishing corporation | steady flow pressure distri- | galvis,1 pedro villaseñor,1
impact | system flow | hindawi publishing corporation
international journal | tive representative flow characteristics | instituto mexicano del
journal | ulub impact breccias | investigación y posgrado
journal of petroleum | d. a. kring et al . | m
matrix | j. a. swaffield et al . | mckeown et al
publishing corporation | j. c. sabathier et al . | open access article
rock | j. g. hernández et al . | rock

*Link to paper used

Contact us to arrange a free example output of your content

Call us today on +44 (0) 1684 540091 or Click here to send us an Email

At the same time, you could also learn more about how we can help you FORM your data for success!


Alpha Petroleum Resources Limited (formerly ATP Oil & Gas) has used Merlin Datawise as its data storage repository for a number of years and has found the services provided by Datawise to be not only very efficient but, most importantly, cost-effective too.

The services have been customised to meet Alpha's needs and the company provides excellent turnaround for its digital and hardcopy data, copying and distribution. The expert team at Merlin understand all the data types and always ask the right questions to ensure first-rate data entry and excellent search and retrieval. Useful review meetings are provided to make sure that the data storage needs of Alpha Petroleum are being fully met.

Alan Bird, Data Manager Alpha Petroleum Resources Limited

Spectrum Data Management has always found the staff at Merlin Datawise to be very pleasant, helpful, and knowledgeable about archiving seismic data. When performing our annual offsite storage audit of our data at Merlin, Spectrum Data Management staff has always received a warm welcome and valuable feedback on our archive process. For this audit process, Merlin provided British Standards Institute recommendations for digital media preservation and detailed records of compliance with these standards. Their historical knowledge of seismic data surveys, thorough inventory, and willingness to share their experience have aided in Spectrum's development of Seismic Data Inventory Policies and Procedures.

Kathy Burris, Multi-Client Data Manager USA Spectrum Geo Inc.

One of Merlin's most important strengths is communication, even on complex topics. Over the past few years I have dealt with a few service companies, and with Merlin I know that everything will be done right because of their quality and responsiveness.

Anna Kordek, Multi Client Data Manager Spectrum Geo Ltd

Summit Exploration and Production Ltd has been a customer of Merlin Datawise for ten years, and I have personally used them for longer than that when I worked for other companies. I have found the staff to be extremely knowledgeable about the data that they are storing. This makes information searches and retrievals efficient and rapid when requested.

We found their service to be much better than other offsite suppliers with the result that we have now consolidated all our offsite data storage with Merlin.

Richard Inwards, IT Manager Summit Exploration and Production Limited


Contact Us

By Road: Under an hour from Birmingham and just 2 hours from London. For SatNav use WR13 6RN.

By Rail: On the London (Paddington) to Hereford rail line.

By Air: Birmingham International, London Heathrow, London Gatwick and Cardiff International all within easy reach.

Unit 3, Station Yard, Colwall
Malvern, Worcestershire, WR13 6RN
Phone: +44 (0) 1684 540091
Fax: +44 (0) 1684 541308

Business Hours:
Monday - Friday 9am to 5pm

Click here to send an email

Have a project you think we could help with?

Get in touch