Web Application & Software Architecture 101
Two-tier applications:
User interface and business logic run on the same server, while a separate backend server handles the data access logic.
Three-tier applications:
User interface, business logic and data access run as separate tiers.
So, if we take the example of a simple blog,
the user interface would be written using Html, JavaScript, CSS,
the backend application logic would run on a server like Apache &
the database would be MySQL. A three-tier architecture works best
for simple use cases.
Why the need for N-tier applications?
For the single responsibility principle and separation of concerns.
The single responsibility principle is a reason why I was never a fan of stored procedures.
Stored procedures enable us to add business logic to the database,
which is a big no for me. What if in future we want to plug in a different database?
Where do we take the business logic? To the new database? Or do we try to
refactor the application code & squeeze in the stored procedure logic somewhere?
A database should not hold business logic, it should only take care of persisting
the data. This is what the single responsibility principle is. And this is why
we have separate tiers for separate components.
Keeping the components separate makes them reusable. Different services can use
the same database, the messaging server or any component as long as they are not
tightly coupled with each other.
Typical examples of fat clients are utility apps, online games, etc.
In the industry, the layers of an application typically mean the user interface layer, the business layer, the service layer and the data access layer.
Having loosely coupled components is the way to go. The approach makes scaling
the service easy in future when things grow beyond a certain level.
Server-Side Rendering #
Developers often use a server to render the user interface on the backend and
then send the rendered data to the client. The technique
is known as server-side rendering. I will discuss the pros
and cons of client-side vs. server-side rendering further down the course.
As I stated earlier, for every response, there has to be a request first. The client sends the request & the server responds with the data. This is the default mode of HTTP communication, called the HTTP PULL mechanism.
Clients use AJAX (Asynchronous JavaScript & XML) to send requests to the server in the HTTP Pull-based mechanism, i.e. polling with Ajax.
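A rough sketch of the fixed-interval polling idea (my illustration, not from the course; `fetch` stands in for the AJAX call the browser would make):

```python
import time

def poll(fetch, interval_s, max_attempts):
    """Repeatedly call fetch() until it returns data or attempts run out.

    fetch is a stand-in for the AJAX request a browser would issue."""
    for _ in range(max_attempts):
        data = fetch()
        if data is not None:
            return data          # the server finally had something new
        time.sleep(interval_s)   # wait before the next pull
    return None                  # gave up: every pull came back empty

# Simulated server: has data ready only on the third pull.
responses = iter([None, None, "new comment!"])
result = poll(lambda: next(responses), interval_s=0.01, max_attempts=5)
```

Note how two of the three pulls are wasted round trips, which is exactly the cost that push-based techniques below try to avoid.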
There are multiple technologies involved in the HTTP Push-based mechanism, such as Ajax Long Polling, Web Sockets, HTML5 Event Source, Message Queues and Streaming over HTTP.
But what if we are certain that the response will take more time than the TTL set by the browser?
Persistent Connection # In this case, we need a Persistent Connection between the client and the server. A persistent connection is a network connection between the client & the server that remains open for further requests & the responses, as opposed to being closed after a single communication.
Resource Intensive? #
Yes, it is. Persistent connections consume a lot of resources in comparison to the HTTP Pull behaviour. But there are use cases where establishing a persistent connection is vital to a feature of an application.
For instance, a browser-based multiplayer game has a pretty large amount of request-response activity within a certain time in comparison to a regular web application.
It would be apt to establish a persistent connection between the client and the server from a user experience standpoint.
Long opened connections can be implemented by multiple techniques such as Ajax Long Polling, Web Sockets, Server-Sent Events etc.
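A minimal sketch of the long-polling idea (my illustration): instead of answering immediately, the "server" holds the request open until data arrives or a timeout expires:

```python
import queue
import threading

events = queue.Queue()  # stands in for the server-side event source

def long_poll(timeout_s):
    """Block until an event arrives or the timeout expires,
    like a request held open by the server."""
    try:
        return events.get(timeout=timeout_s)
    except queue.Empty:
        return None  # the client would immediately re-issue the request

# Another thread publishes an event shortly after the client starts waiting.
threading.Timer(0.05, events.put, args=("score update",)).start()
received = long_poll(timeout_s=1.0)
```

Unlike plain polling, the client gets the data as soon as it exists, with no wasted empty responses in between.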
To avoid all this rendering time on the client, developers often render the UI on the server, generate HTML there & directly send the HTML page to the UI.
This technique is known as server-side rendering. It ensures faster rendering of the UI, averting the UI loading time in the browser window, since the page is already created and the browser doesn’t have to do much assembling and rendering work.
Single Responsibility Principle: there should be only one reason for a class to change. For example, a model class should be the only place that updates the db. This keeps the code loosely coupled: when changing the db, we only have to make changes in the model class, i.e. the way data is inserted/updated in the db, without any effect on the business logic. Sachin Tendulkar's performance on the pitch should only vary based on his sportsmanship, not on how he was treated by his wife in the morning.
In the RESTful architecture, the backend defines what data is available for each resource on each URL, while the frontend always has to request all the information in a resource, even if only a part of it is needed.
In the worst-case scenario, a client application has to read multiple resources through multiple network requests; that is under-fetching, while pulling back more of a resource's data than is needed is over-fetching. A query language like GraphQL on the server-side and client-side lets the client decide which data it needs by making a single request to the server.
Queries are used for data fetching and mutations are used to modify server-side data. In the example below, you will see that a query has the exact same shape as the result. This essential GraphQL feature always provides you with the expected results because it lets the server know exactly what the client is asking for.
In the example below, a query is requesting multiple resources (author, article) which are called fields in GraphQL. It also requests a particular set of nested fields (name, urlSlug) for the field article, even though the entity itself offers more data in its GraphQL schema (e.g. description, releaseData for article). A RESTful architecture would need at least two waterfall requests to retrieve the author entity and its articles, but the GraphQL query made it happen in just one query. In addition, the query only selected and sent the necessary fields instead of bringing back the whole entity.
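To make the "query shape equals result shape" point concrete, here is a toy field selector in Python (my illustration only; a real GraphQL server resolves a typed schema, and the field names follow the author/article example above):

```python
def select(entity, fields):
    """Return only the requested fields, recursing into nested selections."""
    result = {}
    for field, sub in fields.items():
        value = entity[field]
        result[field] = select(value, sub) if sub else value
    return result

author = {
    "name": "Robin",
    "article": {"name": "GraphQL 101", "urlSlug": "graphql-101",
                "description": "intro", "releaseData": "2019"},
}
# The "query": the author's name plus only two nested fields of article.
query = {"name": None, "article": {"name": None, "urlSlug": None}}
response = select(author, query)
# description/releaseData are never sent, unlike a typical REST response.
```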
What about the code? Why does the code need to change when it has to run on multiple machines? # If you need to run the code in a distributed environment, it needs to be stateless. There should be no state in the code. What do I mean by that?
No static instances in the class. Static instances hold application data & if a particular server goes down all the static data/state is lost. The app is left in an inconsistent state.
Rather, use persistent storage like a key-value store to hold the data and remove all state/static variables from the class. This is why functional programming became so popular with distributed systems: functions don’t retain any state.
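A sketch of the difference (my illustration; a plain dict stands in for an external key-value store such as Redis):

```python
# Anti-pattern: state held at class level; lost if this server instance dies,
# and inconsistent across multiple server instances.
class CounterStateful:
    count = 0  # static/class-level state
    def hit(self):
        CounterStateful.count += 1
        return CounterStateful.count

# Stateless version: every instance (on any machine) reads and writes
# the shared external store instead of holding state itself.
kv_store = {}  # stand-in for Redis/Memcached

class CounterStateless:
    def hit(self):
        kv_store["count"] = kv_store.get("count", 0) + 1
        return kv_store["count"]

a, b = CounterStateless(), CounterStateless()  # think: two servers
a.hit()
total = b.hit()  # both see the same shared count
```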
Always have a ballpark estimate in mind when designing your app. How much traffic will it have to deal with?
Development teams today are adopting a distributed micro-services architecture right from the start & the workloads are meant to be deployed on the cloud. So, inherently the workloads are horizontally scaled out on the fly.
Caching # Cache wisely. Cache everywhere. Cache all the static content. Hit the database only when it is really required. Try to serve all the read requests from the cache. Use a write-through cache.
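A minimal write-through cache sketch (my illustration): every write goes to the cache and the database together, so subsequent reads can be served from the cache:

```python
class WriteThroughCache:
    """Writes update cache and database together; reads prefer the cache."""
    def __init__(self, database):
        self.db = database
        self.cache = {}

    def write(self, key, value):
        self.cache[key] = value  # cache first...
        self.db[key] = value     # ...then the backing store, on every write

    def read(self, key):
        if key in self.cache:
            return self.cache[key]   # cache hit: no database trip
        value = self.db[key]         # cache miss: hit the database...
        self.cache[key] = value      # ...and populate the cache
        return value

db = {"home.html": "<h1>Hi</h1>"}
c = WriteThroughCache(db)
c.write("about.html", "<h1>About</h1>")
page = c.read("about.html")  # served from the cache; db untouched on read
```

The trade-off: writes are slightly slower (two stores are updated), but the cache never serves stale data for keys written through it.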
Graph databases are faster because the relationships in them are not calculated at query time, as happens with joins in relational databases. Rather, the relationships are persisted in the data store in the form of edges, and we just have to fetch them. There is no need to run any sort of computation at query time.
A good real-life example of an application which would fit a graph database is Google Maps. Nodes represent the cities and the Edges represent the connection between them.
Now, if I have to look for roads between different cities, I don’t need joins to figure out the relationship between the cities when I run the query. I just need to fetch the edges which are already stored in the database.
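The roads example can be sketched with a simple adjacency map (my illustration; city names are made up): the relationships are stored up front, not computed per query:

```python
# Edges persisted up front, the way a graph database stores relationships.
roads = {
    "Delhi": ["Jaipur", "Agra"],
    "Jaipur": ["Delhi", "Ajmer"],
    "Agra": ["Delhi"],
    "Ajmer": ["Jaipur"],
}

def neighbours(city):
    """No join, no computation at query time: just fetch the stored edges."""
    return roads.get(city, [])

connected = neighbours("Jaipur")
```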
Key-Value Databases #
Typical use cases of a key-value database are the following: caching, persisting user state, persisting user sessions, managing real-time data, implementing queues, creating leaderboards in online games and web apps, and implementing a pub-sub system.
Example of dynamic caching
Google Spanner is both strongly consistent and highly available (Google cites up to 99.999% availability). For consensus resolution it uses the Paxos algorithm; Raft is a comparable consensus algorithm.
Features Of A Message Queue #
Message queues facilitate asynchronous behaviour. We have already learned what asynchronous behaviour is in the AJAX lesson. Asynchronous behaviour allows the modules to communicate with each other in the background without hindering their primary tasks.
Real World Example Of A Message Queue #
Think of email as an example: the sender and the receiver of the email don’t have to be online at the same moment to communicate. The sender sends an email, and the message is temporarily stored on the message server until the recipient comes online and reads it.
Message queues enable us to run background processes, tasks, batch jobs. Speaking of background processes, let’s understand this with the help of a use case.
Think of a user signing up on a portal. After he signs up, he is immediately allowed to navigate to the home page of the application, but the sign-up process isn’t complete yet. The system has to send a confirmation email to the registered email id of the user. Then the user has to click on the confirmation email for the confirmation of the sign-up event.
But the website cannot keep the user waiting until it sends the email to the user. Either he is allowed to navigate to the home page or he bounces off. So, this task is assigned as an asynchronous background process to a message queue. It sends an email to the user for confirmation while the user continues to browse the website.
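The sign-up flow above can be sketched with a queue and a background worker thread (my illustration; the email send is faked with a list append):

```python
import queue
import threading

email_queue = queue.Queue()
sent = []

def email_worker():
    """Background consumer: sends confirmation emails one by one."""
    while True:
        address = email_queue.get()
        if address is None:                        # sentinel: shut down
            break
        sent.append(f"confirmation -> {address}")  # pretend SMTP call

worker = threading.Thread(target=email_worker)
worker.start()

def sign_up(address):
    email_queue.put(address)  # enqueue the slow task and return at once...
    return "home page"        # ...the user browses on without waiting

page = sign_up("new.user@example.com")
email_queue.put(None)  # tell the worker to stop (for this demo)
worker.join()
```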
This is how a message queue can be used to add asynchronous behaviour to a web application. Message queues are also used to implement notification systems just like Facebook notifications. I’ll discuss that in the upcoming lessons.
Polling doesn’t provide real-time data and is also resource-intensive, since it repeatedly hits the database.
Using A Message Queue To Handle the Traffic Surge #
When millions of users around the world update an entity concurrently, we can queue all the update requests in a high-throughput message queue and then process them sequentially, one by one, in FIFO (First In, First Out) order.
How Facebook Handles Concurrent Requests On Its Live Video Streaming Service With a Message Queue? #
Facebook’s approach of handling concurrent user requests on its LIVE video streaming service is another good example of how queues can be used to efficiently handle the traffic surge.
On the platform, when a popular person goes LIVE there is a surge of user requests on the LIVE streaming server. To avert the incoming load on the server Facebook uses cache to intercept the traffic.
But, since the data is streamed LIVE often the cache is not populated with real-time data before the requests arrive. Now, this would naturally result in a cache-miss & the requests would move on to hit the streaming server.
To avert this, Facebook queues all the user requests, requesting for the same data. It fetches the data from the streaming server, populates the cache & then serves the queued requests from the cache.
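A synchronous sketch of that coalescing idea (my illustration, not Facebook's actual code): duplicate requests for the same data are queued together, the streaming server is hit once, and everyone is then served from the cache:

```python
cache = {}
origin_hits = []

def fetch_from_stream_server(key):
    origin_hits.append(key)    # the expensive call to the LIVE server
    return f"segment:{key}"

def serve(requests):
    """Group concurrent requests per key; fetch once, then fan out from cache."""
    waiting = {}
    for user, key in requests:
        waiting.setdefault(key, []).append(user)  # queue duplicates together
    responses = {}
    for key, users in waiting.items():
        if key not in cache:                           # one miss per key...
            cache[key] = fetch_from_stream_server(key) # ...one origin hit
        for user in users:
            responses[user] = cache[key]               # served from cache
    return responses

out = serve([("u1", "live42"), ("u2", "live42"), ("u3", "live42")])
```

Three concurrent requests result in a single hit on the streaming server instead of three.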
In a distributed system, the tasks are shared by several nodes; in a centralized system, by contrast, the tasks are queued to be processed one by one.
Shared Nothing Architecture #
A shared-nothing architecture means eliminating all single points of failure. Every module has its own memory and its own disk, so even if several modules in the system go down, the other modules online stay unaffected. It also helps with scalability and performance.
What Is A Hexagonal Architecture? #
The architecture consists of three components: ports, adapters and the domain. The hexagonal shape of the structure doesn’t have anything to do with the pattern; it’s just a visual representation of the architecture. Initially, the architecture was called the Ports and Adapters pattern; later the name Hexagonal stuck.
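A tiny ports-and-adapters sketch (my illustration; all names are made up): the domain depends only on a port (an interface), and concrete adapters plug in from the outside:

```python
from abc import ABC, abstractmethod

class NotificationPort(ABC):
    """Port: what the domain needs, with no technology details."""
    @abstractmethod
    def send(self, message): ...

class SignupDomain:
    """Domain logic: depends only on the port, never on a concrete adapter."""
    def __init__(self, notifier: NotificationPort):
        self.notifier = notifier
    def register(self, user):
        return self.notifier.send(f"welcome {user}")

class InMemoryAdapter(NotificationPort):
    """Adapter: one concrete implementation plugged into the port."""
    def __init__(self):
        self.outbox = []
    def send(self, message):
        self.outbox.append(message)  # a real adapter might call SMTP here
        return message

adapter = InMemoryAdapter()
result = SignupDomain(adapter).register("alice")
```

Swapping the in-memory adapter for, say, an SMTP-backed one requires no change to the domain class, which is the whole point of the pattern.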
Reactive Programming: a development model structured around asynchronous data streams.
Peer to Peer Architecture:
A P2P network is a network in which computers, also known as nodes, can communicate with each other without the need for a central server. The absence of a central server rules out the possibility of a single point of failure. All the computers in the network have equal rights. A node acts as a seeder and a leecher at the same time, so even if some of the computers/nodes go down, the network and the communication are still up.
A seeder is a node which hosts the data on its system and provides bandwidth to upload it to the network; a leecher is a node which downloads the data from the network.
Say a system hosts a large file of 75 Gigabytes. Other nodes in the network, in need of the file, locate the system containing the file. Then they download the file in chunks, re-hosting the downloaded chunk simultaneously, making it more available to the other users. This approach is known as Segmented P2P file transfer.
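A toy simulation of segmented transfer (my illustration; sizes are tiny for clarity): the file is split into chunks, and a downloader re-hosts each chunk as soon as it has it:

```python
def split_chunks(data, size):
    """Split a file into fixed-size chunks for segmented transfer."""
    return [data[i:i + size] for i in range(0, len(data), size)]

file_data = b"large-file-contents"
seeder = {"host-A": split_chunks(file_data, 4)}  # the original full copy

# A leecher downloads chunk by chunk, re-hosting each chunk immediately,
# so later peers can fetch those chunks from it instead of the seeder.
leecher_chunks = []
for chunk in seeder["host-A"]:
    leecher_chunks.append(chunk)  # download...
    # ...and from this moment the chunk is also served by the leecher

reassembled = b"".join(leecher_chunks)
```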
P2P architecture
The decentralized web is gaining ground in the present times. I can’t deny that this is disruptive tech with immense potential. Blockchain and cryptocurrency are one example of this; they have taken the financial sector, in particular, by storm.
There are numerous P2P applications available on the web for instance –
Tradepal
Peer to Peer digital cryptocurrencies like Bitcoin, Peercoin.
GitTorrent (a decentralized GitHub which uses BitTorrent and Bitcoin).
Twister (a decentralized microblogging service, which uses WebTorrent for media attachments).
Diaspora (a decentralized social network implementing the federated architecture).
Federated architecture is an extension of the decentralized architecture, used in decentralized social networks, which I am going to discuss up next.
1. CORS concept