Title | Orchestrating the natural language processing software in the cloud computing environment |
Publication Type | Journal Article |
Year of Publication | 2013 |
Authors | Ustalov, D, Goldshtein, M |
Journal | Journal of Digital Information Management |
Volume | 11 |
Issue | 5 |
Pagination | 396 - 399 |
Date Published | 2013 |
Keywords | Cloud computing, Data-intensive computing, Distributed computing, Natural Language Processing, Service orchestration |
Abstract | Most natural language processing problems are data-intensive. An important step in the distributed orchestration of natural language processing software is a rational choice of the specific middleware. The middleware should solve the presented problem with minimal deployment, support, and usage costs. It is necessary to run and use that software in a distributed cloud computing environment to achieve advantages such as consolidation, isolation, and efficient use of the existing infrastructure. It is often impossible to modify existing natural language processing software to integrate it into the cloud computing environment because of licensing or organizational issues. This paper studies various popular distributed data processing tools and evaluates selected natural language processing tools on a relatively large document collection in a distributed way using the Gearman framework. The document collection consists of 10,000 sentences from the Russian news subcorpus of the Leipzig Corpora. The benchmarks are presented and discussed. |
URL | http://www.scopus.com/inward/record.url?eid=2-s2.0-84890443570&partnerID=40&md5=80168707d25eb1e8a3acabfb35fdf93f |