Table of Contents

Apache Solr: A Practical Approach to Enterprise Search
Dikshant Shahi

Any source code or other supplementary materials referenced by the author in this text is available to readers at www.apress.com. For detailed information about how to locate your book’s source code, go to www.apress.com/source-code/.

ISBN 978-1-4842-1071-0
e-ISBN 978-1-4842-1070-3
DOI 10.1007/978-1-4842-1070-3
© Apress 2015

Managing Director: Welmoed Spahr
Acquisitions Editor: Celestin Suresh John
Development Editor: Matthew Moodie
Technical Reviewer: Shweta Gupta
Editorial Board: Steve Anglin, Pramilla Balan, Louise Corrigan, James DeWolf, Jonathan Gennick, Robert Hutchinson, Celestin Suresh John, Michelle Lowman, James Markham, Susan McDermott, Matthew Moodie, Jeffrey Pepper, Douglas Pundick, Ben Renow-Clarke, Gwenan Spearing
Coordinating Editor: Rita Fernando
Copy Editor: Sharon Wilkey
Compositor: SPi Global
Indexer: SPi Global

For information on translations, please e-mail [email protected], or visit www.apress.com/. Apress and friends of ED books may be purchased in bulk for academic, corporate, or promotional use. eBook versions and licenses are also available for most titles. For more information, reference our Special Bulk Sales–eBook Licensing web page at www.apress.com/bulk-sales.

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’s location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law.

Trademarked names, logos, and images may appear in this book. Rather than use a trademark symbol with every occurrence of a trademarked name, logo, or image, we use the names, logos, and images only in an editorial fashion and to the benefit of the trademark owner, with no intention of infringement of the trademark. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Distributed to the book trade worldwide by Springer Science+Business Media New York, 233 Spring Street, 6th Floor, New York, NY 10013. Phone 1-800-SPRINGER, fax (201) 348-4505, e-mail [email protected], or visit www.springer.com. Apress Media, LLC is a California LLC and the sole member (owner) is Springer Science + Business Media Finance Inc. (SSBM Finance Inc.). SSBM Finance Inc. is a Delaware corporation.

To my foster mother, Mrs. Pratima Singh, for educating me!

Introduction

This book is for developers who are building or planning to build an enterprise search engine using Apache Solr. Chapters 1 and 3 can be read by anyone who intends to learn the basics of information retrieval, search engines, and Apache Solr specifically. Chapter 2 kick-starts development with Solr and will prove to be a great resource for Solr newbies and administrators. All other chapters explore the Solr features and approaches for developing a practical and effective search engine.

This book covers use cases and examples from various domains such as e-commerce, legal, medical, and music, which will help you understand the need for certain features and how to approach the solution. While discussing the features, the book generally provides a snapshot of the required configuration, the command (using curl) to execute the feature, and a code snippet as required. The book dives into implementation details and writing plug-ins for integrating custom features. What this book doesn’t cover is performance improvement in Solr and optimizing it for high-speed indexing.

This book covers Solr features through release 5.3.1, which is the latest at the time of this writing.

What This Book Covers

Chapter 1, Apache Solr: An Introduction, as the name states, starts with an introduction to Apache Solr and its ecosystem. It then discusses the features, reasons for Solr’s popularity, its building blocks, and other information that will give you a holistic view of Solr. It also introduces related technologies and compares Solr to other alternatives.

Chapter 2, Solr Setup and Administration, begins with Solr fundamentals and covers Solr setup and the steps for indexing your first set of documents and searching them. It then describes the Solr administrative features and various management options.

Chapter 3, Information Retrieval, is dedicated to the concepts of information retrieval, content extraction, and text processing.

Chapter 4, Schema Design and Text Analysis, covers the schema design, text analysis, going schemaless, and managed schemas in Solr. It also describes common text-analysis techniques.

Chapter 5, Indexing Data, concentrates on the Solr indexing process by describing the indexing request flow, various indexing tools, supported document formats, and important update request processors. This is also the first chapter that provides the steps to write a Solr plug-in, a custom UpdateRequestProcessor in this case.

Chapter 6, Searching Data, describes the Solr searching process, various query types, important query parsers, supported request parameters, and steps for writing a custom SearchComponent.

Chapter 7, Searching Data: Part 2, continues the previous chapter and covers local parameters, result grouping, statistics, faceting, reranking queries, and joins. It also dives into the details of function queries for deducing a practical relevance ranking and steps for writing your own named function.

Chapter 8, Solr Scoring, explains the Solr scoring process, supported scoring models, the score computation, and steps for customizing similarity.

Chapter 9, Additional Features, explores Solr features including spell-checking, autosuggestion, document similarity, and sponsored search.

Chapter 10, Traditional Scaling and SolrCloud, covers the distributed architectures supported by Solr and steps for setting up SolrCloud, creating a collection, distributed indexing and searching, shard splitting, and ZooKeeper.

Chapter 11, Semantic Search, introduces the concept of semantic search and covers the tools and techniques for integrating semantic capabilities in Solr.

What You Need for This Book

Apache Solr requires Java Runtime Environment (JRE) 1.7 or newer. The provided custom Java code is tested on Java Development Kit (JDK) 1.8 and requires Apache Maven. The last chapter requires downloading resources required by Apache OpenNLP and WordNet.

Who This Book Is For

This book expects you to have a basic understanding of the Java programming language, which is essential if you want to execute the custom components.

Acknowledgments

My first vote of thanks goes to my daily dose of caffeine (without which this book would not have been possible), my sister for preparing it, and my wife for teaching me to prepare it myself. Thanks to my parents for their love!

Thank you, Celestin, for providing me the opportunity to write this book; Rita for coordinating the whole process; and Shweta, Matthew, Sharon, and SPi Global for all their help to get this book to completion. My sincere thanks to everyone else from Apress for believing in me.

I am deeply indebted to everyone whom I have worked with in my professional journey and everyone who has motivated me and helped me learn and improve, directly or indirectly. A special thanks to my colleagues at The Digital Group for providing the support, flexibility, and occasional work break to complete the book on time.

I would also like to thank all the open source contributors, especially of Apache Lucene and Solr; without their great work, there would have been no need for this book.

As someone has rightly said, it takes a village to create a book. In creating this book, there is a small village, Sandha, located in the land of Buddha, which I frequented for tranquility and serenity that helped me focus on writing this book. Thank you!

Apache Solr: A Practical Approach to Enterprise Search PDF

464 Pages·3.559 MB·English

by Dikshant Shahi

#Computers #Networking: Internet


Detailed Information

Author:	Dikshant Shahi
ISBN:	978-1-4842-1071-0
Pages:	464
Language:	English
File Size:	3.559 MB
Format:	PDF
Price:	FREE

