Log Aggregation with Grafana Loki
We recently deployed Loki to our infrastructure and in this blog post want to share the pitfalls and tips we discovered.
»It’s been a while since we shared the story of an incident with you, and that’s probably a good thing – most operational incidents we had in the past year were “boring” enough in nature to be fixed easily. This time, we’ve got a story of data loss, caused by pure and simple human error – and the story of how we recovered the data.
»Last night, there was a disruption in the network of the data center our servers run in. We do not yet know the source or exact nature of the disruption, but it probably caused increased latencies and error rates in the connections between our application servers and our database servers. This alone would not have been a problem, but it triggered a series of other problems that led to a full outage of our system starting at approximately 06:57 CET. I was alerted about the issue (and woken up) at 07:02 CET. At 07:04 CET, the system was already available again, but three of our four application servers were still not working.
»pretix is a multi-tenant application: With one software installation, it can handle lots of companies and institutions selling tickets. In pretix, they are called organizers, but in the more general case, we usually speak of “tenants” in the software industry. Building pretix this way is a design choice; we could just as well have created software that only handles one company and run it many times on logically or physically separate systems, one for every event organizer. We decided to go with multi-tenancy in the software many years ago for a number of reasons.
»From the very beginning, privacy has been one of the most important design goals of pretix. Privacy is usually best and most easily achieved by just not collecting any unnecessary data, and we do our very best to live by that standard.
»Both our documentation and our website contain a number of screenshots of our software. Taking these screenshots manually is really tedious, since you first need to populate your database with sensible test data, select a proper display resolution, and then take screenshots separately for every language our software supports, so we can use them properly in the localized websites.
»pretix comes with a test suite of currently 2167 tests. When executing these tests, 81 % of pretix’ codebase is run. These tests are intended to verify that pretix is operating correctly and errors and regressions are spotted early. Having such a test suite saved us from a lot of major problems in the past and will hopefully continue to do so in the future.
»In my last blog post, I described our hosting setup for pretix.eu in detail and talked about the efforts we make to achieve resilience against failing servers: The system should tolerate the failure of any single server at any given time and keep running. Well, up to now, this was only nearly true.
»This night, pretix.eu had a planned outage for around 70 minutes to allow us to make a fundamental change to our service architecture. In this blogpost, we want to go into detail on what we changed and why. We hope this might be insightful for you if you run a similar service or if you are just curious about the challenges we experience along the way.
»In version 1.8 of pretix, we introduced shipping management as a new feature for the pretix Hosted and Enterprise editions. With this plug-in, you can choose not to deliver your tickets to your visitors as a downloadable file, but to send them via snail mail instead. Of course, you can just download those tickets as regular-sized PDF files, print them out and ship them, but the feature is usually most interesting if you want to send out high-quality tickets that look good e.g. in gift wrapping under a Christmas tree or pinned to a wall as a souvenir of the event.
»Although we are somewhat experienced with software development, we are still early in the process of learning how to get pretix to the right customers and turn pretix into a sustainable and profitable project. As one of many experiments on this journey, we presented at a major trade fair for the first time this week. With this blog post, we want to give detailed insight into the process, our decisions and our new experiences. If you are preparing to exhibit at a trade show for the first time, this might give you valuable insight – or just be a fun read.
»Welcome to behind.pretix.eu! What is this, you may ask? Yes, pretix already has a blog at pretix.eu but we sometimes want to share some backstage information with you that might be interesting to some of you who maintain open source projects, run software services or create businesses yourself.