Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems

99,00 EGP

Description

Price: $0.99
(as of Nov 22, 2024 16:45:24 UTC)


Customers say

Customers find the subject material well-organized, detailed, and thorough. They describe the book as amazing, a good read for techies, and well worth picking up. Readers praise the writing quality as clear, fantastic, and easy to read. However, some customers report missing pages.

AI-generated from the text of customer reviews

This Post Has 7 Comments

  1. Essential reading for anyone working on distributed systems in any capacity
    Designing Data-Intensive Applications really exceeded my expectations. Even if you are experienced in this area, this book will reinforce things you know (or sort of know) and bring to light new ways of thinking about solving distributed systems and data problems. It will give you a solid understanding of how to choose the right tech for different use cases.
    The book really pulls you in with an intro that is more high level, but mentions problems and solutions that really anyone who has worked on these types of applications has either encountered or heard mention of. The promise it makes is to take issues such as scalability, maintainability and durability and explain how to decide on the right solutions to them for the problems you are solving. It does an amazing job of that throughout the book.
    This book covers a lot, but at the same time it knows exactly when to go deep on a subject. Right when it seems like it may be going too deep on things like how different types of databases are implemented (SSTables, B-trees, etc.) or on comparing different consensus algorithms, it is quick to point out how and why those things are important to practical real-world problems and how understanding those things is actually vital to the success of a system.
    Along those same lines, it is excellent at circling back to concepts introduced at prior points in the book. For example, the book goes into how log-based storage is used by some databases as their core way of storing data, and for durability in other cases. Later in the book, when getting into different message/eventing systems such as Kafka and ActiveMQ, things swing back to how these systems utilize log-based storage in similar ways. Even if you have prior knowledge or have even worked with these technologies, how and why they work and the pros and cons of each become crystal clear and really solidified. The same can be said of its great explanations of things like ZooKeeper and why specific solutions like Kafka make use of it.
    This book is also amazing at shedding light on the fact that so little of what is out there is totally new. It attempts to go back as far as it can at times on where a certain technology’s ideas originated (back to the 1800s at some points!). Bringing in this history really gives a lot of context around the original problems that were being solved, which in turn helps in understanding the pros and cons. One example is the way it goes through the history of batch processing systems and HDFS. The author starts with MapReduce and relates it to tech that was developed decades before. This really clarifies how we got from batch processing systems on proprietary hardware to things like MapReduce on commodity hardware, thanks in part to HDFS, and eventually to stream-based processing. It also does a great job of explaining the pros and cons of each and when one might choose one technology over the other.
    That’s really the theme of this book: teaching the reader how to compare and contrast different technologies for solving distributed systems and data problems. It teaches you to read between the lines on how certain technologies work so that you can identify the pros and cons early and without needing them to be spelled out by the authors of those technologies. When thinking about databases, it teaches you to really consider the durability/scalability model and how things are nowhere near black and white between “consistent” vs “eventually consistent”. There is a ton of nuance there, and it goes deep on things like single-leader vs multi-leader vs leaderless replication, linearizability, total order broadcast, and different consensus algorithms.
    I could go on forever about this book. To name a few other things it touches on, to give a good idea of the breadth here: networking (and networking faults), OLAP, OLTP, two-phase locking, graph databases, two-phase commit, data encoding, general fault tolerance, compatibility, message passing, everything I mentioned above, and the list goes on and on and on. I recommend that anyone who does any kind of work with these systems take the time to read this book. All 600ish pages are worth reading, and it’s presented in an excellent, engaging way with real-world practical examples for everything.

  2. An exceptionally good review of the state of the art
    It is really hard to overstate how comprehensively this book covers nearly everything that is currently known about building large, scalable, high performance, data centric applications. If every Kafka queueing, Cassandra clustering, Redis loving, Kinesis slinging, Map reducing, CAP theorem quoting systems engineer read this book, the world would actually be a better place. It really is that good.
    The front pages contain a quote from Alan Kay that I will summarize as “most people who write code for money … have no idea where [their culture came from].” This book will learn you some of the culture you are missing! Every developer writing modern Internet facing application software, particularly in cloud computing environments, will run into the problems described here. Far too many of these developers will pick up a grab bag of half baked solutions from reading various Stack Overflow posts, blogs from better informed writers, and from hyped up claims made by the currently trendy “technologies.” Many of these sources will obscure the fundamental nature of the underlying problems, and will lead said developers to overly naive designs, and provide a false sense of security. Such systems will even work pretty nicely for a while, but they will usually fail spectacularly when they are actually presented with component failures or high system load (or both at the same time, which is quite typical).
    This book talks about the underlying structure of the problems we all face when building contemporary distributed applications. It ties together all the foundational aspects of both distributed computing and data storage in a coherent manner. It teaches you how to think about the problem space by demonstrating where many popular and widely used software products fit. You’re not going to learn about any one single product. Instead you will learn what you must know to evaluate as many of them as you want, learn which interactions between different components matter, and then make informed choices for your own design.
    This is not an academic textbook; it is a working professional’s guide to the field. It has all the references to the classic papers and textbooks that form the formal foundations of the subject, but it is so clearly written and accessible that you could go a very long way without needing to read any of them.

  3. It is a compilation of best practices and, to a certain extent, “universal” concepts that help in understanding the high-performance applications that run on high-availability environments today.
    Common sense for IT professionals, which is always worth revisiting.
    Very current and very enjoyable. When I finish reading it I will update this review, but from what I have read so far, I give it a ten.

  4. This book is dense, which is both a pro and a con. This is not a book you lightly read in the afternoon. This book took me weeks, if not months, to properly read through and comprehend. The good news, however, is that you come away much more knowledgeable about big data and distributed systems.
    This book leans far more towards theory than practice, although contemporary technologies are discussed and analyzed. As of 2023/2024, some of the examples are a bit dated.
    I would approach this book like an undergraduate textbook. You need to pace yourself and examine supplementary material to gain a full understanding. Some of the facts and conclusions are non-obvious and need time to percolate inside your mind to glean the most information from this book.
    The print quality was acceptable and the book was sturdy enough. However, it’s bulky and you may look out of place trying to read this in a cafe.
    If possible, I would recommend reading this as part of a study group so you can discuss ideas. I read this by myself and sometimes felt lost since I had no one to discuss it with.
    Overall, this book definitely lives up to its hype.

  5. A complete book that shows the inner workings and mechanisms of modern distributed systems, with considerable emphasis on streams and big data. Many already consider the work a classic, and not without reason: it is well deserved! Highly recommended!

  6. I have not read the full book yet, but almost every chapter I have read is in some way applicable to my day-to-day work at the fintech where I work. The examples and explanations this book contains are very clear, simple to understand and relevant to the real world.
    Very interesting textbook. Quite a page turner.
