Enterprise DevOps, Log Management and Analytics

Sematext Blog


Latest Blogs from Sematext Blog
By popular demand…we’ve just added some new goodness to SPM with the SPM REST API. This new API lets you: create SPM Apps for monitoring (e.g., generate a new SPM App + its token during deployment); list all available metrics and charts for a specific App; list all alerts defined for so...
Elasticsearch is booming. Together with Logstash, a tool for collecting and processing logs, and Kibana, a tool for searching and visualizing data in Elasticsearch (aka the “ELK” stack), adoption of Elasticsearch continues to grow by leaps and bounds. When it comes to actually using ...
Fresh from GeeCON in Krakow…we have another Elasticsearch and Logging manifesto from Sematext engineers — and book authors — Rafal Kuc and Radu Gheorghe.  As with many of their previous presentations, Radu and Rafal go into detail on Elasticsearch, Logstash and Rsyslo...
For those of you interested in some comprehensive Elasticsearch training taught by experts from Sematext who know it inside and out, we’re running an Elasticsearch Intro workshop in Berlin on Wednesday, June 3 (the day after Berlin Buzzwords ends). This full-day, hands-on training work...
For those of you using Cassandra or considering it, we’ve got another Sematext user’s Planet Cassandra case study for your reading pleasure. In this Q&A/use case, Alain Rodriguez, main Data Architect for Teads.tv, talks about the business and technical needs that drove their decision ...
If you’re working with Elasticsearch, it’s very likely that you’ll need to reindex data at some point. The most popular reason is because you need a mapping change that is incompatible with your current mapping. New fields can be added by default, but many changes are...
One of the trends we see in our Elasticsearch and Solr consulting work is that everyone is processing one kind of data stream or another.  Including us, actually – we process endless streams of metrics, continuous log and event streams, high volume clickstreams, etc.  Kafka is cle...
Did you know that Logsene provides a complete ELK Stack; i.e., a complete log management, analytics, exploration, and visualization solution? Logsene currently supports Kibana 3, with complete Kibana 4 support to be released soon. Can’t wait to use Kibana 4 with Logsene? No proble...
Do your SPM charts ever look choppy like this? Let’s hope not!  However, if you run large Elasticsearch clusters, especially those with thousands of shards, you may suffer from Elasticsearch sometimes taking a long time to provide information about its metrics that SPM, of course...
For those of you interested in some comprehensive Elasticsearch training taught by experts (and authors of several Elasticsearch books!) who know it inside and out, you are in luck if you are attending — or considering — the GeeCON conference taking place in Krakow from May...
We recently added support for Node.js and io.js monitoring to SPM and have received great feedback.  While SPM for Node.js monitors all key Node.js metrics, most applications have additional metrics one often wants to track — things like: the number of concurrent users, the numbe...
If you’re using rsyslog for processing lots of logs (and, as we’ve shown before, rsyslog is good at processing lots of logs), you’re probably interested in monitoring it. To do that, you can use impstats, which comes from input module for process stats. impstats produ...
Hot off the press: a brand new Solr Cookbook! One of Sematext’s Solr and Elasticsearch experts — and authors — Rafał Kuć, has just published the third and latest edition of Solr Cookbook. This edition covers both Solr 4.x (based on the newest 4.10.3 version of Solr) and the just-rele...
HBase is a popular open-source, non-relational (NoSQL), column-oriented, distributed database that runs on top of the Hadoop Distributed File System (HDFS).  HBase is well suited for sparse data sets, which are common in many big data use cases.  Fortunately for all its users, SPM now ...
The results for the HBase version distribution poll are in.  Thanks to everyone who took the time to vote! The distribution pie chart is below, but we could summarize it as follows: a big chunk of HBase clusters, about 30%, are still “stuck” on HBase 0.94.x; over 37% of the HBas...
The results for the Apache Kafka version distribution poll are in. Thanks to everyone who took the time to vote! The distribution pie chart is below, but we could summarize it as follows: It’s great to see Kafka users being so quick to migrate to the latest version of Kafka! We...
As we are updating SPM for HBase to make sure SPM collects all the key HBase metrics that were added in 0.98, we thought it would be good to see which HBase versions are being used in the wild.  We’re on 0.98 after being on 0.94 for a long time.  How about you? Please tweet this pol...
Using Cloud (aka SaaS) applications is natural for most of us — simply sign up with your email, login and then use the service within minutes. The Cloud works particularly well with consumer-oriented services. Businesses, however, have slightly different needs. Up until now, Semate...
With Kafka 0.8.2 and 0.8.2.1 being released and with the updated SPM for Kafka monitoring over 100 Kafka metrics, we thought it would be good to see which Kafka versions are being used in the wild.  Kafka 0.7.x was a strong and stable release used by many.  The 0.8.1.x release has been...
Kafka 0.8.2 has a pile of new metrics for all three main Kafka components: Producers, Brokers, and Consumers. Not only does it have a lot of new metrics, the whole metrics part of Kafka has been redone — we worked closely with Kafka developers for several weeks to bring order and stru...
By default, Elasticsearch does a good job of figuring the type of data in each field of your logs. But if you like your logs structured like we do, you probably want more control over how they’re indexed: is time_elapsed an integer or a float? Do you want your tags analyzed so you can ...
New functionality is rolling out in SPM Performance Monitoring! Watch this space for future posts on Transaction Tracing, Global and App-specific Server Views, Kafka 0.8.2 monitoring and other cool stuff. For this post, those of you who use HAProxy are in luck as we just adde...
“Solr or Elasticsearch?”…well, at least that is the common question I hear from Sematext’s consulting services clients and prospects. Which one is better, Solr or Elasticsearch? Which one is faster? Which one scales better? Which one can do X, and Y, and Z? Which one is easier to ...
At Graphflow, our mission is to empower online stores of all sizes to grow their businesses by providing them access to the same machine learning and Big Data tools used by the largest and most sophisticated tech players in the market. To deliver on this mission, we decided from the...
Sematext is looking for strong full-stack developers who: find creative and elegant solutions, build tools, avoid repetition and boilerplate code; take ownership and push forward, and want to help build the team and the organization; like working with data-intense applications, contin...
If you’re an avid Solr user you’ll want to check out these Lucene / Solr Revolution videos from two of Sematext’s Solr experts: Rafal Kuc and Radu Gheorghe. Radu talked about Solr performance tuning, which is always nice for keeping your applications snappy and your costs down. This i...
If you use Cassandra you will find some interesting insights in this Planet Cassandra case study by Sematext client Recruiting.com. Hitendra Pratap Singh, a Cassandra Software Engineer, talks about why they decided to deploy Cassandra, other NoSQL solutions they looked at, advice for ...
About 10 days ago we ran a poll about which languages/APIs people use when writing their Apache Kafka Producers and Consumers. See Kafka Poll: Producer & Consumer Client. We have collected 130 votes so far. The results were actually somewhat surprising! Let’s share the numbers f...
No, it’s not an endless loop waiting to happen, the plan here is to use Logstash to parse Elasticsearch logs and send them to another Elasticsearch cluster or to a log analytics service like Logsene (which conveniently exposes the Elasticsearch API, so you can use it without having to ...
Many Distributed DevOps Teams Rely on Slack, a platform for team communication providing everything in one place, instantly searchable and available wherever you go. SPM Performance Monitoring‘s new integration via WebHooks provides the capability to forward alerts to many services, i...
One of the great things about Logsene, our log management tool, is that you don't need to care about the back-end - you know, where you store your logs. You just pick a log shipper (here are Top 5 Log Shippers), point it to Logsene (here's How to Send Logs to Logsene) and you are done....
Kafka has become the de-facto standard for handling real-time streams in high-volume, data-intensive applications, and there are certainly a lot of those out there. We thought it would be valuable to conduct a quick poll to find out which implementation of Kafka Producers and Co...
Using a performance monitoring system that you built yourself? You are not alone!  Many organizations monitor their applications and IT infrastructure with a bolted-together and often incompatible assortment of tools.  With larger organizations this can amount to a dozen or more differ...
While many SPM Performance Monitoring users quickly see the benefits of SPM and adopt it in their organizations for monitoring — not just for Elasticsearch, but for their complete application stack — some Elasticsearch users evaluate SPM and compare it to Marvel from Elasticsearch. We...
The Log Shipper Poll results are in! We run Logsene here at Sematext, so we wanted to know what people like to use to ship their logs. Before we share the results, a few words about the poll: we published it here on our blog on September 22, 2014; we automatically tweeted it and p...
Many agile DevOps teams rely on communication via HipChat, which provides an API and mobile apps to receive messages while being away from one’s desktop. SPM Performance Monitoring‘s new integration via WebHooks provides the capability to forward alerts to many services, including Hip...
Apache Spark is an open-source, large-scale data processing engine built on top of the Hadoop Distributed File System (HDFS) and enables applications in Hadoop clusters to run up to 100x faster in memory, and 10x faster even when running on disk.  So it’s not surprising the usage of Sp...
If so, you are not alone! We talk to a lot of people who want to reduce the frequent “noise” from monitoring alarms. To solve this common problem, Sematext added anomaly detection for alerts and PagerDuty integration to its SPM Performance Monitoring solution to dramatically reduce t...
Thanks to everyone who stopped by the Sematext booth at last week’s Lucene/Solr Revolution event in Washington, DC and attended our two talks: Tuning Solr for Logs by Radu Gheorghe, and Solr Anti-Patterns by Rafal Kuc. The attendance, questions and interest are very much appreciated.  As a c...
Going to Lucene/Solr Revolution next week — November 11-14 — in Washington, DC?  If so…Sematext will be there exhibiting AND giving two talks!  If you are going, stop by our table to say hello.  We can show you the latest versions of SPM Performance Monitoring, Logsen...