Welcome!

Enterprise DevOps, Log Management and Analytics

Sematext Blog

Subscribe to Sematext Blog: eMailAlertsEmail Alerts
Get Sematext Blog via: homepageHomepage mobileMobile rssRSS facebookFacebook twitterTwitter linkedinLinkedIn


Related Topics: Marketing and Sales

Blog Feed Post

Large Scale Log Analytics with Solr – Presentation Upvoting

If topics like log analytics and Solr are your thing then we may have a treat for you at the upcoming Lucene / Solr Revolution conference in Austin in October.  Two of Sematext’s engineers and Solr, Elasticsearch and ELK stack experts — Rafal Kuc and Radu Gheorghe — have proposed a talk called “Large Scale Log Analytics with Solr” and could use some upvoting from the community to get in on this year’s agenda.

To show your support for “Large Scale Log Analytics with Solr” just click here to vote.  Takes less than a minute!  Even if you don’t attend the conference, we’ll post the slides and video here on the blog…assuming it gets on the agenda.  Voting will close at 11:59pm EDT on Thursday, June 25th.

LR_2015

Talk Summary

This talk is about searching and analyzing time-based data at scale. Documents ranging from blog posts and social media to application logs and metrics generated by smart watches and other “smart” things share a similar pattern: timestamp among their fields, rarely changeable, deletion when they become obsolete.

Very often this kind of data is so large that it causes scaling and performance challenges. We’ll address precisely these challenges, which include:

  1. Properly designing collections architecture
  2. Indexing data fast and without documents waiting in queues for processing
  3. Being able to run queries that include time-based sorting and faceting on enormous amounts of indexed data without killing Solr
  4. …and many more

We’ll start with the indexing pipeline — where you do all your ETL. We’ll show you how to maximize throughput through various ETL tools, such Flume, Kafka, Logstash and rsyslog, and make them scale and send data to Solr.

On the Solr side, we’ll show all sorts of tricks to optimize indexing and searching: from tuning merge policies to slicing collections based on timestamp. While scaling out, we’ll show how to improve the performance/cost ratio.

Thanks for your support!


Filed under: Search Tagged: conference, solr, talk

Read the original blog entry...

More Stories By Sematext Blog

Sematext is a globally distributed organization that builds innovative Cloud and On Premises solutions for performance monitoring, alerting and anomaly detection (SPM), log management and analytics (Logsene), and search analytics (SSA). We also provide Search and Big Data consulting services and offer 24/7 production support for Solr and Elasticsearch.