Posts about AWS

Amazon SageMaker: What, Why and How

Posted by Gaurav Mishra on Dec 27, 2019 4:59:00 PM

According to IDC, the Artificial Intelligence market will grow at a compound annual growth rate of 37% through 2022. Owing to this momentum, several tools and platforms have emerged to make AI adoption easier. However, one tool that clearly stands out is Amazon SageMaker. In this blog, we take an in-depth look at what it is, why you should use it, and how to go about using it.

What Is Amazon SageMaker?

Amazon SageMaker is a fully managed AWS solution that empowers data scientists and developers to quickly build, train, and deploy machine learning models. At its core is Amazon SageMaker Studio, an integrated development environment for machine learning that serves as the base for a collection of other SageMaker tools.

You can build and train ML models from scratch or purchase pre-built algorithms that suit your project requirements. Similar tools are available for debugging models or adding manual review processes atop model predictions.
Image via Amazon

Why Should You Use It?

The complexity of a machine learning project in any enterprise increases as it scales. This is because machine learning projects comprise three key stages - build, train, and deploy - which continuously loop back into one another as the project progresses. And as the amount of data being dealt with grows, so does the complexity. If you are planning to build an ML model that truly works, your training datasets will tend to be on the larger side.

Typically, different skill sets are required at different stages of a machine learning project. Data scientists are involved in researching and formulating the machine learning model, while developers are the ones taking the model and transforming it into a useful, scalable product or web-service API. But not every enterprise can put together a skilled team like that, or achieve the necessary coordination between data scientists and developers to roll out workable ML models at scale.

This is exactly where Amazon SageMaker steps in. As a fully managed machine learning platform, SageMaker abstracts away the software engineering, enabling data scientists to build and train the machine learning models they want with an intuitive, easy-to-use set of tools. While they play to their core strengths - working with the data and crafting the ML models - the heavy lifting of turning those models into a ready-to-roll web-service API is handled by SageMaker.

Amazon SageMaker packs all the components used for machine learning into a single shell, allowing data scientists to deliver end-to-end ML projects with reduced effort and at lower cost.

How Does It Work?

With a 3-step model of Build-Train-Deploy, Amazon SageMaker simplifies and streamlines your machine learning modeling. Let’s take a quick look at how it works.


Build

Amazon SageMaker offers you a completely integrated development environment for machine learning that improves your productivity. With its one-click Jupyter notebooks, you can start building quickly, and one-click sharing lets you hand those notebooks off to others. The entire code and configuration is captured automatically, so you can collaborate without hurdles.

Apart from this, Amazon SageMaker Autopilot is the industry's first automated machine learning capability that gives you complete control and visibility into your machine learning models. Traditional approaches to automated machine learning do not let you peek into the data or logic used to create a model. Autopilot, however, integrates with SageMaker Studio and provides complete visibility into the raw data and logic used in the model's creation.

One of the highlights of Amazon SageMaker is its Ground Truth feature, which helps you build and manage precise training datasets without hurdles. Ground Truth provides complete access to labelers via Amazon Mechanical Turk, along with pre-built workflows and interfaces for common labeling tasks. SageMaker also supports a range of deep learning frameworks, including PyTorch, TensorFlow, Apache MXNet, Chainer, Gluon, Keras, Scikit-learn, and the Deep Graph Library.

Leveraging Amazon SageMaker, Srijan built a video analytics solution that can scrape video feed data to log asset performance.


Using AWS Lambda, Amazon SageMaker and Amazon S3, Srijan developed a video analytics solution for the client. The solution utilized a machine learning model to scrape video feed data and log asset performance over a given period of time and assigned location.

As a result, it helped in:

  • Claims validation against machines that were failing to clean the given sites
  • Insight-based behavior analysis of the assets, leading to improvement of the product
  • Enabling more proactive, instead of reactive, asset performance assessment and maintenance

View the Complete Case Study



Train

Using Amazon SageMaker Experiments, you can easily organize, track, and evaluate every iteration of a machine learning model. Training a model involves many iterations to measure and isolate the impact of changing algorithm versions, model parameters, and datasets. SageMaker Experiments helps you manage these iterations by automatically capturing the configurations, parameters, and results, and storing them as 'experiments'.
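Under the hood, these experiments map to the CreateExperiment and CreateTrial calls in the SageMaker API. A minimal sketch using boto3 follows; the experiment and trial names are illustrative, and the helper functions are our own:

```python
def experiment_request(name, description):
    # Body for the SageMaker CreateExperiment API call.
    return {"ExperimentName": name, "Description": description}

def trial_request(experiment_name, trial_name):
    # Body for CreateTrial: one trial records one training iteration.
    return {"ExperimentName": experiment_name, "TrialName": trial_name}

def register_iteration(experiment_name, trial_name, description, region="us-east-1"):
    # Creates the experiment and trial in your account; requires AWS credentials.
    import boto3  # imported here so the payload helpers above stay dependency-free
    sm = boto3.client("sagemaker", region_name=region)
    sm.create_experiment(**experiment_request(experiment_name, description))
    sm.create_trial(**trial_request(experiment_name, trial_name))
```

Training jobs can then be associated with a trial so that their parameters and metrics are captured against it.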

SageMaker comes with a debugger capable of analyzing, debugging, and fixing problems in your machine learning model. Debugger makes the training process transparent by capturing real-time metrics as it runs, and generates warnings and remediation advice when common problems are detected during training.

Apart from this, AWS's TensorFlow optimizations deliver up to 90% scaling efficiency across as many as 256 GPUs, so you can train accurate, sophisticated models in far less time. Furthermore, SageMaker's Managed Spot Training helps reduce training costs by up to 90%.
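In API terms, Managed Spot Training is switched on per training job via the EnableManagedSpotTraining flag plus a stopping condition. A hedged sketch of just those fields (the helper function is ours; the values are examples):

```python
def spot_training_config(max_run_seconds, max_wait_seconds):
    # Fields merged into a SageMaker CreateTrainingJob request to use spot capacity.
    # MaxWaitTimeInSeconds must be >= MaxRuntimeInSeconds; the difference is how
    # long SageMaker may wait for spot instances before giving up.
    if max_wait_seconds < max_run_seconds:
        raise ValueError("max_wait_seconds must be >= max_run_seconds")
    return {
        "EnableManagedSpotTraining": True,
        "StoppingCondition": {
            "MaxRuntimeInSeconds": max_run_seconds,
            "MaxWaitTimeInSeconds": max_wait_seconds,
        },
    }
```

With checkpointing configured, interrupted spot jobs resume from the last checkpoint, which is what keeps the advertised cost savings practical.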


Deploy

Amazon SageMaker offers one-click deployment so you can easily generate predictions for batch or real-time data. You can deploy your model on auto-scaling ML instances across multiple Availability Zones for improved redundancy. You just specify the instance type and the desired minimum and maximum instance counts, and leave the rest to SageMaker.
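Concretely, "specify the minimum and maximum and the instance type" maps to an endpoint configuration plus an Application Auto Scaling target. A sketch of the two request bodies, assuming a single "AllTraffic" variant (the names are illustrative and the helpers are our own):

```python
def endpoint_config_request(config_name, model_name, instance_type, initial_count):
    # Body for SageMaker CreateEndpointConfig: which model to host, on what instances.
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InstanceType": instance_type,
            "InitialInstanceCount": initial_count,
        }],
    }

def autoscaling_target(endpoint_name, min_capacity, max_capacity):
    # Body for Application Auto Scaling RegisterScalableTarget: the min/max
    # instance counts you hand over to SageMaker to manage.
    return {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/AllTraffic",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "MinCapacity": min_capacity,
        "MaxCapacity": max_capacity,
    }
```

The endpoint config is passed to CreateEndpoint; the scaling target is registered with the Application Auto Scaling service, which then adds or removes instances between the two bounds.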

A major problem that can affect the accuracy of your entire operation is drift between the data used to generate predictions and the data used to train the model. SageMaker Model Monitor helps you out of this puzzle by detecting and remediating concept drift: it automatically detects drift in all of your deployed models and raises alerts so you can identify the source of the problem.

Amazon SageMaker also includes an Augmented AI facility, which lets human reviewers step in when the model cannot make a high-confidence prediction. Moreover, Amazon Elastic Inference can reduce your machine learning inference costs by up to 75%. Lastly, Amazon allows you to integrate SageMaker with Kubernetes, so you can automate the deployment, scaling, and management of your applications.

So there you have it, a look at how Amazon SageMaker can help you build, train, and deploy machine learning models to suit your project requirements.

Srijan is an advanced AWS Consulting Partner, and can help you utilize AWS solutions at your enterprise. To know more, drop us a line outlining your business requirements and our expert AWS team will get in touch.

Topics: AWS, Machine Learning & AI, Architecture

Why Should Your Organization Opt for Infrastructure as a Service (IaaS)

Posted by Kimi Mahajan on Nov 29, 2019 1:29:00 PM

Businesses are moving away from keeping data in traditional data centers and physical servers, migrating instead to innovative and reliable cloud technologies. With the many benefits of cloud computing - anytime data access, enhanced disaster recovery, improved flexibility, and a reduced burden on infrastructure staff - enterprises are developing more cost-efficient applications with higher performance and more effortless scalability.

IaaS, one such cloud computing model, has made lives of both enterprises and developers simpler by reducing their burden of thinking about infrastructure.

But how do enterprises know if they need to opt for IaaS?

Understanding Infrastructure as a Service (IaaS)

IaaS refers to cloud services offered over a network, allowing businesses to access their infrastructure remotely. A fit for enterprises of any size, it offers the advantage of not having to buy hardware or other equipment, while letting you manage firewalls, IP addresses, servers, routers, load balancing, virtual desktop hosting, storage, and much more, cost-effectively through a scalable cloud model.

It gives organizations the flexibility to pay only for the services they use, which gives IaaS cloud computing an edge over traditional on-premise resources. Businesses find it easier to scale by paying per usage from a virtually unlimited pool of computing resources, instead of spending on new hardware.


Why Opt For IaaS Cloud Model?

IaaS is beneficial for organizations for a number of reasons. Let’s discuss its benefits in detail-

Usage of Virtual Resources

Your organization might never have to invest in resources such as CPU cores, hard disks or storage space, RAM, virtual network switches, VLANs, IP addresses, and more, giving you the feel of owning a virtual datacenter.

It allows multiple users to access the same hardware anywhere and anytime over an internet connection, keeping users on the move. And even if a server goes down or hardware fails, services aren't affected, offering greater reliability.

Cost Savings With Pay-As-You-Go Pricing Model

With metered usage, enterprises pay only for the time services are used, avoiding fixed monthly or annual rental fees and any upfront charges. This lowers infrastructure costs and saves them from buying extra capacity as a buffer against sudden business spikes. IaaS providers give users the option to purchase storage space, though they need to be careful here, as pricing differs between providers.
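The pay-as-you-go argument is easy to make concrete with back-of-the-envelope arithmetic. A small sketch comparing metered usage against owned hardware; all prices here are made-up illustrations, not any provider's actual rates:

```python
def metered_monthly_cost(hourly_rate, hours_used):
    # IaaS: pay only for the hours actually consumed.
    return round(hourly_rate * hours_used, 2)

def owned_monthly_cost(hardware_cost, amortization_months, monthly_upkeep):
    # On-premise: hardware amortized over its lifetime plus fixed upkeep,
    # paid whether or not the capacity is used.
    return round(hardware_cost / amortization_months + monthly_upkeep, 2)

# A dev/test server needed ~200 hours a month, at a notional $0.10/hour:
cloud = metered_monthly_cost(0.10, 200)    # $20.00 for the month
onprem = owned_monthly_cost(2400, 24, 50)  # $2400 server over 24 months + $50 upkeep
```

The gap narrows for workloads that run around the clock, which is why metered pricing favors intermittent or spiky usage.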

Highly Scalable, Flexible and Quicker

One of the greatest benefits of IaaS is the ability to scale up and down quickly in response to an enterprise's requirements. IaaS providers generally have the latest, most powerful storage, servers, and networking technology to accommodate the needs of their customers. This on-demand scalability provides added flexibility and greater agility to respond to changing opportunities and requirements. Also, with IaaS, the time to market for your product is much shorter.

High Availability

Business continuity and disaster recovery preparedness are the top drivers for adopting IaaS. It remains a highly available infrastructure, and unlike traditional hosting, even in the event of a disaster it offers users the flexibility to access the infrastructure via an internet connection.

With a robust architecture and scalable infrastructure layer, organizations can consolidate their different disaster recovery systems into a virtualized environment for disaster recovery, for securing their data. This stands as the perfect use case for IaaS.

By outsourcing their infrastructure, organizations can focus their time and resources on innovation and developing new techniques in applications and solutions.

How Do You Choose Between IaaS, Containers or Serverless?

The next question you might have is how to choose between the IaaS cloud computing model, containers, and the serverless model.

Well, the one thing they all have in common is that they simplify developers' lives by letting them focus only on writing code. Let's look into the differences:






What It Is

  • IaaS: Instantly available virtualized computing resources over the internet, eliminating the need for hardware
  • Containers: Package an application and the associated elements needed to run it properly, with all dependencies
  • Serverless: Application broken up into functions and hosted by a third-party vendor

Use Case

  • IaaS: Organizations can consolidate their disaster recovery systems into one virtualized environment for backup, securing data
  • Containers: Refactoring a bigger monolithic application into smaller independent parts, e.g. splitting a large application into a few separate services such as user management, media conversion, etc.
  • Serverless: Applications which do not always need to be running

Vendor Operability

  • IaaS: Cloud vendor manages infrastructure
  • Containers: No vendor lock-in
  • Serverless: Vendor lock-in

Pricing Model

  • IaaS: Pay-as-you-go for the time resources are used
  • Containers: At least one VM instance with hosted containers is always running, hence costlier than serverless
  • Serverless: Pay for what you use; cost-effective

Maintenance

  • IaaS: User responsible for patching and security hardening
  • Containers: Not maintained by cloud providers; developers are responsible for maintenance
  • Serverless: Nothing to manage

Web Technology Hosting

  • IaaS: Can host any technology: Windows, Linux, any web server technology
  • Containers: Only Linux-based deployments
  • Serverless: Not made for hosting web applications

Deployment Time

  • IaaS: Instantly available
  • Containers: Take longer to set up initially than serverless
  • Serverless: Takes milliseconds to deploy


IaaS is the most flexible model and best suits temporary, experimental, and unexpected workloads. Srijan is an Advanced AWS Consulting Partner. Leveraging AWS's vast repository of tools, we can help you choose the best option for outsourcing your infrastructure and achieving your business goals. Contact us to get started on your IaaS journey.


Topics: AWS, Cloud, Architecture

AWS Glue: Simple, Flexible, and Cost-effective ETL For Your Enterprise

Posted by Gaurav Mishra on Oct 31, 2019 6:28:00 PM

An Amazon solution, AWS Glue is a fully managed extract, transform, and load (ETL) service that allows you to prepare your data for analytics. The AWS Glue Data Catalog gives you a unified view of your data, so that you can clean, enrich, and catalog it properly. This ensures that your data is immediately searchable, queryable, and available for ETL.

It offers the following benefits:

  • Less Hassle: Since AWS Glue is integrated across a wide range of AWS services, it natively supports data stored in Amazon Aurora, Amazon RDS engines, Amazon Redshift, Amazon S3, as well as common database engines and Amazon VPC. This leads to reduced hassle while onboarding.
  • Cost Effectiveness: AWS Glue is serverless, so there are no compute resources to configure and manage. Additionally, it handles provisioning, configuration, and scaling of the resources required to run your ETL jobs on a fully managed, scale-out Apache Spark environment. This is quite cost effective as you pay only for the resources used while your jobs are running.
  • More Power: AWS Glue automates much of the effort spent in building, maintaining, and running ETL jobs. It crawls your data sources, identifies data formats, and suggests schemas and transformations. It even automatically generates the code to execute your data transformations and loading processes.

AWS Glue helps enterprises significantly reduce the cost, complexity, and time spent creating ETL jobs. Here’s a detailed look on why use AWS Glue:

Why Should You Use AWS Glue?

AWS Glue brings with it the following unmatched features that provide innumerable benefits to your enterprise:

Integrated Data Catalog

AWS Glue consists of an integrated Data Catalog which is a central metadata repository of all data assets, irrespective of where they are located. It contains table definitions, job definitions, and other control information that can help you manage your AWS Glue environment. 

Using the Data Catalog can help you automate much of the undifferentiated heavy lifting involved in cleaning, categorizing or enriching the data, so you can spend more time analyzing the data. It computes statistics and registers partitions automatically so as to make queries against your data both efficient and cost-effective.

Clean and Deduplicate Data

You can clean and prepare your data for analysis by using an AWS Glue Machine Learning Transform called FindMatches, which enables deduplication and finding matching records. And you don’t need to know machine learning to be able to do this. FindMatches will just ask you to label sets of records as either “matching” or “not matching”. Then the system will learn your criteria for calling a pair of records a “match” and will accordingly build an ML Transform. You can then use it to find duplicate records or matching records across databases.

Automatic Schema Discovery

AWS Glue crawlers connect to your source or target data store and progress through a prioritized list of classifiers to determine the schema for your data. Glue then creates metadata and stores it in tables in your AWS Glue Data Catalog. This metadata is used when authoring your ETL jobs. To keep your metadata up to date, you can run crawlers on a schedule, on demand, or triggered by an event.
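A crawler that scans an S3 path on a nightly schedule can be defined with Glue's CreateCrawler API. A sketch of the request body; the crawler name, database, role ARN, and cron schedule are all placeholders:

```python
def crawler_request(name, role_arn, database, s3_path, schedule=None):
    # Body for the Glue CreateCrawler API call. Schedule is a cron expression;
    # leave it out to run the crawler on demand instead.
    req = {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }
    if schedule is not None:
        req["Schedule"] = schedule
    return req

nightly = crawler_request(
    "sales-crawler",
    "arn:aws:iam::123456789012:role/GlueCrawlerRole",  # placeholder role ARN
    "sales_db",
    "s3://example-bucket/raw/sales/",
    schedule="cron(0 2 * * ? *)",  # 2 AM UTC daily
)
```

Passing this body to a boto3 `glue` client's `create_crawler` call (with valid credentials and role) registers the crawler; omitting `schedule` gives you the on-demand variant described above.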

Code Generation

AWS Glue can automatically generate code to extract, transform, and load your data. You simply point AWS Glue to your data source and target, and it will create ETL scripts to transform, flatten, and enrich your data. The code is generated in Scala or Python and written for Apache Spark.

Developer Endpoints

AWS Glue development endpoints enable you to edit, debug, and test the code that it generates for you. You can use your favorite IDE (Integrated development environment) or notebook. Or write custom readers, writers, or transformations and import them into your AWS Glue ETL jobs as custom libraries. You can also use and share code with other developers using the GitHub repository.

Flexible Job Scheduler

You can easily invoke AWS Glue jobs on schedule, on-demand, or based on an event. Or start multiple parallel jobs and specify dependencies among them in order to build complex ETL pipelines. AWS Glue can handle all inter-job dependencies, filter bad data, and retry jobs if they fail. Also, all logs and notifications are pushed to Amazon CloudWatch so you can monitor and get alerts from a central service.
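Inter-job dependencies are expressed as conditional triggers. A hedged sketch of a CreateTrigger request body that starts a load job only after all upstream jobs succeed; the job names are invented and the helper is our own:

```python
def conditional_trigger(name, upstream_jobs, downstream_job):
    # Body for the Glue CreateTrigger API call: start downstream_job only
    # once every upstream job has reached the SUCCEEDED state.
    return {
        "Name": name,
        "Type": "CONDITIONAL",
        "StartOnCreation": True,
        "Predicate": {
            "Logical": "AND",
            "Conditions": [
                {"LogicalOperator": "EQUALS", "JobName": job, "State": "SUCCEEDED"}
                for job in upstream_jobs
            ],
        },
        "Actions": [{"JobName": downstream_job}],
    }
```

Chaining such triggers is how the complex, multi-stage ETL pipelines mentioned above are assembled without any external scheduler.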

How Does It Work?

You are now familiar with the features of AWS Glue and the benefits it brings to your enterprise. But how should you use it? Surprisingly, creating and running an ETL job is just a matter of a few clicks in the AWS Management Console.

All you need to do is point AWS Glue to your data stored on AWS, and AWS Glue will discover your data and store the associated metadata (e.g. table definition and schema) in the AWS Glue Data Catalog. Once cataloged, your data is immediately searchable, queryable, and available for ETL.

Here’s how it works:

  • Define crawlers to scan data coming into S3 and populate the metadata catalog. You can schedule this scanning to run at a set frequency or to trigger on every event
  • Define the ETL pipeline, and AWS Glue will generate the ETL code in Python
  • Once the ETL job is set up, AWS Glue manages its execution on a Spark cluster infrastructure, and you are charged only while the job runs

The AWS Glue catalog lives outside your data processing engines and keeps the metadata decoupled, so different processing engines can simultaneously query it for their individual use cases. The metadata can also be exposed through an API layer built with API Gateway, with all catalog queries routed through it.

When to Use It?

With all this information about AWS Glue, you may still wonder where to put it to use. Here's a look at some use case scenarios and how AWS Glue can make your work easier:

1. Queries Against an Amazon S3 Data Lake

Looking to build your own custom Amazon S3 data lake architecture? AWS Glue can make it possible immediately, by making all your data available for analytics even without moving the data. 

2. Analyze Log Data in Your Data Warehouse

Using AWS Glue, you can easily process all the semi-structured data in your data warehouse for analytics. It generates the schema for your data sets, creates ETL code to transform, flatten, and enrich your data, and loads your data warehouse on a recurring basis.

3. Unified View of Your Data Across Multiple Data Stores

AWS Glue Data Catalog allows you to quickly discover and search across multiple AWS data sets without moving the data. It gives a unified view of your data, and makes cataloged data easily available for search and query using Amazon Athena, Amazon EMR, and Amazon Redshift Spectrum.

4. Event-driven ETL Pipelines

AWS Glue can run your ETL jobs based on an event, such as getting a new data set. For example, you can use an AWS Lambda function to trigger your ETL jobs to run as soon as new data becomes available in Amazon S3. You can also register this new dataset in the AWS Glue Data Catalog as part of your ETL jobs.
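The Lambda glue between S3 and AWS Glue can be as small as a handler that calls StartJobRun for each new object. A sketch, with the job name and argument key as assumptions; the optional `glue_client` parameter is our addition so the handler can be exercised without AWS access:

```python
def lambda_handler(event, context, glue_client=None):
    # S3 "ObjectCreated" events arrive as records; start one Glue job run per object.
    if glue_client is None:
        import boto3  # only needed when running inside Lambda with real credentials
        glue_client = boto3.client("glue")
    run_ids = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        resp = glue_client.start_job_run(
            JobName="nightly-etl",  # assumed Glue job name
            Arguments={"--source_path": f"s3://{bucket}/{key}"},  # assumed argument key
        )
        run_ids.append(resp["JobRunId"])
    return {"started": run_ids}
```

Wiring the S3 bucket's event notification to this function completes the event-driven pipeline: a new object lands, the handler fires, and the ETL job runs against just that object.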

So there you have it, a look at how AWS Glue can help manage your data cataloging process and automate your ETL pipeline.

Srijan is an advanced AWS Consulting Partner, and can help you utilize AWS solutions at your enterprise. To know more, drop us a line outlining your business requirements and our expert AWS team will get in touch.

Topics: AWS, Architecture

Cloud migration paths - Which ones should you choose?

Posted by Urvashi Melwani on Sep 11, 2019 3:05:00 PM

As more infrastructure and applications shift towards the cloud to reinforce digital transformation, one of the most critical decisions enterprises must make well ahead of time is which cloud migration approach will serve them best in the long term.

As per a survey conducted by Netscout in 2018, a majority of enterprises (56% of respondents) had already started workload migration. Another 14% of respondents were in the planning stage, and a further 15% planned to carry out the migration within 6 months to 1 year.

[Figure: Netscout 2018 survey of enterprise cloud migration timelines. Source: Netscout]

As is apparent, there's no one-size-fits-all answer; up-front planning will make the migration process easier, and the whole cloud transition smoother.

So which is the best cloud migration approach for your business?

This blog takes a look at the three distinct migration approaches to help you choose the right one.

It’s time to reach the cloud

The same report also predicts that 80% of companies feel the need to move their workloads to the cloud as soon as possible. Although there are multiple approaches to doing so, we will discuss the three most common here. Naturally, each has benefits and disadvantages:

  1. Lift and shift aka Rehost
  2. Lift, Tinker, and shift aka Replatform
  3. Refactor

1. Lift and Shift or Rehost 

Rehosting, or the lift and shift approach, is a forklift approach to migrating applications to the cloud without any code modifications. It involves lifting part or all of an application from an on-premise or existing cloud environment to a new cloud environment.

It is currently considered the most common migration method, comprising 40% of all migrations, because of its agility, simplicity, and speed in comparison to replatforming and refactoring.

This is beneficial for large enterprises that want to migrate quickly with minimal or no disturbance to the existing application workflow.

And once the migration is done, it becomes much easier to optimize the applications, since the difficult part is already behind them.

When to choose this approach?

“This works best for organizations looking to reduce their on-premises infrastructure expenses immediately”

Here are some common instances when enterprises should choose the rehosting approach-

  • Large number of migrations over time
    This lift-and-shift approach is the one to opt for when it's simple, quick, and cheap and you have a lot of migrations to do over time. Additionally, you need to factor into the plan and budget all of the post-migration work involved, such as when you have lifted and shifted non-cloud tools, processes, and people into the cloud.
  • Urgency or pain point
    A common compelling event could be the urgent evacuation of a data center or hosting provider.
    This works best for organizations looking to reduce their on-premises infrastructure expenses immediately, those bearing too much cost in maintaining physical infrastructure, or those that have faced a cloud disaster (e.g. a corrupted database). They should opt for application rehosting to get their applications on the cloud with minor or no modification, and also gain backups for smooth and fast running.
  • Commercial and off-the-shelf applications
    It is an apt choice for organizations with applications on board that need to keep running without intervention or modification. These are generally commercial, off-the-shelf applications, and rehosting is a good strategy: first move them onto the cloud as-is, then optimize.
  • Virtualization and IaaS skillset
    If your available resources are skilled in virtualization and infrastructure as a service, then rehosting matches their skill sets (whereas replatforming and refactoring need more advanced skills).
  • Test environments
    Test environments are essential to running apps successfully. When they aren't well managed, migrating them with a lift-and-shift approach is an easy way to avoid disruption.

Benefits of Rehosting

The benefits of the lift-and-shift approach are-

  • Quick migration
  • Reduced risk (simplicity)
  • Application, hypervisor, and hardware agnostic
  • Can be highly automated with limited or zero downtime
  • Imports configuration and scripts even when these are not documented or are hard to reverse engineer

Limitations of the Rehosting approach

“The rehosting method does not let you reap benefits from the native cloud functionality and tools like elasticity”

The rehosting approach works because it is simpler in terms of migration. However, it carries risks and limitations:

  • Migrating brittle processes

When you migrate an application, you also inherit its operating system, generally undocumented configurations, and non-cloud people and processes. If these processes are not clearly understood pre-migration, you end up with a fragile application and a brittle end product.

  • Cloud-native features

The rehosting method does not let you reap the benefits of native cloud functionality and tools like elasticity. The app functions the way it did on a single physical server, but you cannot take advantage of the added flexibility and scalability offered by cloud environments.

  • Rehosted applications are black boxes

Simply copy-pasting applications and data without understanding what's in them means you are pulling everything into the cloud, including malware or insecure configurations.

  • Unbudgeted/unplanned post-rehosting activities

There are always post-rehosting activities to take care of. These involve additional cost beyond the basic migration process, in money, time, and resources. If avoided, they prove costly in the long run, with high expenditure on over-provisioned resources.

  • Ingest known and unknown problems

If the application faces problems outside the cloud, known or unknown, rehosting will likely bring those problems into the cloud. Retiring technical debt is a big plus of more advanced migration methods like replatforming and refactoring, or the drop-and-shop technique of repurchasing.

2. Lift, Tinker, and Shift or Replatform

In a replatforming migration, part or all of the application is optimized, with a small amount of API up-versioning, before moving to the cloud.

This varies from adding one or two functionalities to completely re-architecting parts of the application before they can be rehosted or refactored and eventually deployed to the cloud.


“Developers can also reuse the resources they are accustomed to working with”

The replatforming approach offers an interim solution between rehosting and refactoring, allowing workloads to take advantage of base cloud functionality and cost optimization without the level of resource commitment a full refactor requires.

Developers can also reuse the resources they are accustomed to working with, such as legacy programming languages, development frameworks, and existing caches in the application.

Replatforming can be used to add new features for better scaling and to leverage the reserved resources of your cloud environment. There are even ways to integrate the app with native cloud features while making little or no code modification.

When to choose this approach?

Take a look at these scenarios when to opt for this approach-

“Replatforming allows you to reshape them to make it compatible with the cloud”

  • Modification of applications is required
    Replatforming is suitable when organizations want to make changes to the API of their applications (up-versioning) and then deploy to the cloud. This may be because the source environment does not support the cloud, or because the organization wants minor changes that don't hamper the application's functioning.
    In such cases, some fine-tuning is required and for that re-platforming is the optimum choice.
  • Avoid post-migration work
    Organizations that used the rehosting method realized there is a slew of tasks to be done post-migration to realize the full potential of the cloud. The feasible solution is to simply make the changes to the application during the migration itself; hence, replatforming works best in such a scenario.
  • Experience with more cloud skills
    If you have resources in your organization who have been working with cloud-based solutions lately and can now shape applications for cloud compatibility, or take shortcuts in the migration process, consider the replatforming approach.
  • Most apps are common three-tier web apps
    When most of your apps are common three-tier web apps, replatforming allows you to reshape them for cloud compatibility. And once you have reshaped one, you can apply the same pattern far and wide, gaining significant efficiencies as the migration moves forward.

Benefits of Re-platforming

“Enterprises can leverage cloud-native functionalities without worrying about the risk, complexity, cost, and time of a full refactor”

Replatforming is a cost-efficient solution. It is an optimal place of action between rehosting and refactoring, where enterprises can leverage cloud-native functionalities without worrying about the risk, complexity, cost, and time of a full refactor.

This approach does not require you to adjust the cloud server to match the previous environment. Instead, you have the flexibility to start small and scale up as needed, which means you can save a lot while the cloud environment grows with the app itself.



Its benefits include:

  • Use of cloud-native functionalities
  • Apps can leverage the cloud provider’s base cost advantages
  • Helps achieve tactical benefits, like reducing the amount of time spent managing database instances
  • Reduces or replaces common application components with better cloud services, such as replacing Nginx in a VM with AWS Elastic Load Balancer

Limitations of Replatforming

“If the cloud service used to replace a component is inappropriate or poorly configured, then the re-platform migration can go wrong”. 

The major risk associated with re-platforming is that the project scope can grow and change unchecked during the process, to become a complete refactor. Managing scope and avoiding unnecessary changes is key to mitigating this risk.

Secondly, if the cloud service used to replace a component is inappropriate or poorly configured, the replatform migration can go wrong. 

Other limitations include:

  • Overly aggressive change
    Every individual shaping during re-platforming increases the risk of causing problems: be circumspect and choose common, well-known shapings. Avoid exotic changes unless the opportunity is genuinely niche or the change is unavoidable. The goal is a successful re-platform, not an exotic one.
  • Automation is required
    Although the re-platforming approach can be done manually, that has limitations, as manual modifications can be time-consuming. A better solution, therefore, is to model the application’s needs using an automation platform and then make modifications to the model to represent the platform shapings.




A summary of the pros and cons of each approach:

Rehost

  • Pro: Minimal work required to move the application
  • Pro: Faster migration and deployment
  • Con: Typically does not take advantage of the cloud’s native features
  • Con: May cost more to operate in the cloud

Partial Refactor

  • Pro: Only parts of the application are modified
  • Pro: Faster migration and deployment than a complete refactor
  • Con: Only takes advantage of some cloud features
  • Con: May cost more to operate in the cloud

Complete Refactor

  • Pro: Applications typically offer higher performance
  • Pro: Applications can be optimized to operate at lower costs
  • Con: Much higher cost, since most of the application must change
  • Con: Slower time to deployment

3. Re-architect or Refactor approach

Refactoring is the process of re-architecting your applications to run on your cloud provider’s platform services, also referred to as Platform as a Service (PaaS).

Refactoring is more complex than the other two approaches because, while making changes to the application’s code, you must ensure they do not impact its external behavior. For example, if your existing application is resource-intensive, it may generate large cloud bills because it involves big data processing or image rendering. In that case, redesigning the application for better resource utilization is required before moving to the cloud.


This approach is the most time-consuming and resource-demanding, yet it can offer the lowest monthly spend of the three approaches, along with the full potential of the cloud to increase performance, resilience, and responsiveness.

When to choose this approach?

Refactoring comes in handy for enterprises in the following scenarios:

“Refactoring method helps in reducing cost and improvements in operations, resilience, responsiveness, and security”

  • Enterprises want to leverage cloud benefits
    Refactoring is the best choice when there is a strong business requirement to add features, scale, or enhance performance by moving to the cloud, in ways that are not possible in the existing non-cloud environment. Simply put, if the old ways no longer meet the bar and you still stick to them, your business may face an existential threat in this phase of cut-throat competition.
  • Scaling up or restructuring code
    When an organization is looking to expand its existing application, or wants to restructure its code to draw on the full potential of its cloud capabilities.
  • Boost agility
    If your organization aspires to amplify agility or improve business continuity by moving to a service-based architecture, this strategy does the trick, even though it is often the most expensive solution in the short to medium term.
  • Efficiency is a priority
    Refactoring helps reduce costs and improve operations, resilience, responsiveness, and security.

Further, you have the option to choose between a partial or complete refactor, depending upon your needs. A partial refactor involves modifying only a small part of the application, which results in faster migration compared to a complete refactor.

Benefits of Refactoring

The benefits of refactoring are realized over time. The current application and its environment configuration determine the complexity of the refactor, and that impacts the time-to-value of the project.

Its benefits include:

“This approach ensures an over-time reduction in costs, matching resource consumption with the demand, and eliminating the waste”

  • Long-term cost reduction
    This approach ensures an over-time reduction in costs by matching resource consumption with demand and eliminating waste. Hence, it brings a better, more lasting ROI compared to less cloud-native applications.
  • Increase resilience
    By decoupling the application’s elements and attaching highly available, managed services, the application inherits the resilience of the cloud.
  • Responsive to business events
    This approach lets the application leverage the auto-scaling features of cloud services, scaling up and down as per demand.

Limitations of Refactoring

Its limitations include:

  • Vendor lock-in
    The more cloud-native your application is, the more tightly it is coupled to the cloud you are in.
  • Skills
    Refactoring demands the highest level of application, automation, and cloud skills and experience to carry out the process.
  • Time
    As refactoring is the most complicated way to move from a non-cloud application to a cloud-native application, it can consume a considerable amount of time.
  • Getting it wrong
    Refactoring involves changing almost everything about the application, so it has the highest probability of something going wrong. Each mistake will cause delays, cost overruns, and potential outages.

Refactoring is a complex process, but it is well worth the results and improvements you get in return. It is a resource-demanding process, one that requires plenty of time to complete. Some companies even go as far as refactoring parts of their business solutions to make the whole process more manageable, though this compartmentalization can also make refactoring longer and more resource-exhausting.

Final words

Which one is the best approach?

There is no absolute answer to the question, especially since different use cases require different things. Picking one among the three approaches is a matter of finding the best fit for your specific needs. That said, start by checking if the app can be moved to a cloud environment in its entirety while maintaining cost and keeping operational efficiency in check. If the answer is yes, start with the rehost method. If rehosting doesn’t seem like a fit for you, or if cost-efficiency is at a level that needs to be refined, you can also consider re-platforming as a good option. Remember that not all apps can be transitioned this way, so you may end up having to find other solutions entirely.

The same approach goes for refactoring. If you have enough time and resources to complete a full refactor of your current solutions, then take SaaS and other alternate solutions into consideration. 

Nevertheless, you can certainly take most of the hassle out of moving to the cloud with the right cloud migration strategy. You can then devote yourself to finding new resources to use, better flexibility to benefit from, and a more effective environment for your apps. 

Keep these points in mind, and you’ll be able to find the best of these approaches for you. However, there is no single defined path to success. Your organization’s needs may vary and may lead you to adopt a combination of these approaches, i.e. a hybrid approach.

For example, it is possible that after conducting a migration analysis for your organization, it is determined that:

  • 50% of your apps need to be re-hosted
  • 10% to be retained on-premises in a colocation facility
  • 40% of apps, which are maintenance-level yet business-critical, are flagged for re-platforming/refactoring

What is important in the end is to plan and roll out your migration plan by conducting a thorough analysis of your complete IT system, your infrastructure, and your application workload. 

This assessment will help you determine which strategy to use and which part(s) should be moved to the cloud. 

Topics: AWS, Cloud, Javascript

Exploring How AWS Serverless Web Apps Work

Posted by Kimi Mahajan on Sep 7, 2019 4:10:00 PM

2014 saw the breakthrough release of AWS Lambda, offering a powerful new way of running applications on the cloud. However, it was soon realised that applications need structure, and that it’s difficult to manage all of the containers that Lambda introduces.

This gave way to the most powerful framework for building applications exclusively on AWS Lambda: the Serverless framework.

With an increasing number of organizations riding the wave of serverless, the way they develop and deliver software has undergone a drastic transformation. Let’s get to know the details of serverless web apps and explore how they work.

The What and Why of Serverless

Serverless refers to an application framework for building web applications without going into the details of servers. The servers are managed by the cloud provider, which takes care of their provisioning and allocation.

This makes the application run in stateless compute containers that are ephemeral and event-triggered. Developers’ productive efforts can thus be channeled in the right direction, saving them the time otherwise spent caught up in the intricate web of modern, complex infrastructure.

Pricing is based on a pay-per-use model rather than pre-purchased compute capacity. Serverless has some of the most practical offerings in the market, and is anticipated to be one of the most used cloud services in the upcoming years.

We have already mentioned in our blog - 5 Reasons To Consider Serverless AWS For Drupal - why AWS is considered the best hosting provider for Drupal websites.

If we compare the architecture for a multi-user, mobile friendly app which requires user authentication of a monolithic with that of a serverless web app, it would look somewhat like the one shown below:

monolithic architectureMonolithic Architecture (Source: Dzone)

The serverless architecture would look something like the one shown below:

serverless architecture

Serverless Architecture (Source: Dzone)

With serverless, application development is dependent on a combination of third-party services, client-side logic and cloud-hosted remote procedure calls, and is hence referred to as Functions as a Service.

FaaS refers to an implementation of serverless architecture where a piece of business logic processes individual requests. Functions are independent, server-side, small, separate units of logic that take input arguments, operate on them, and return a result, such as a Lambda. FaaS is stateless, which means any two invocations of the same function could run on completely different containers.

AWS Lambda, Azure Functions, IBM OpenWhisk and Google Cloud Functions are the most well-known FaaS solutions available, supporting a range of languages and runtimes, e.g. Node.js, Python, etc.


Source: Slideshare

Composition of Serverless App

Assembling a modern application implies creating a solution by combining SaaS with managed/serverless services. This makes the process faster, but complex at the same time, as it requires a lot of manual work to bring all the pieces together.

However, with serverless components, it becomes much simpler. Every project uses AWS resources, divided into three groups:

  • AWS Lambdas
  • AWS API Gateway REST API
  • AWS other resources such as DynamoDB tables, S3 Buckets, etc.

Serverless projects live exclusively on the cloud, i.e. AWS, and don’t have a specific environment. Serverless isolates the AWS resources a project uses for development, testing and production purposes through stages.

Stages can be thought of as environments, except for the fact that they exist merely to separate and isolate your project's AWS resources.

Each serverless project can have multiple stages, and each stage can have multiple AWS Regions.

  • AWS Lambdas

Lambda functions on AWS can be replicated across each region your project uses. When you deploy a function in your project to a stage, it creates a Lambda there. A Lambda can be triggered by events from other AWS services, but not by direct HTTP requests.

  • AWS API Gateway REST API

If your functions have endpoint data, a REST API on AWS API Gateway will automatically be created for your project which can be replicated across each region.

When you deploy an endpoint to a stage, it builds it on your API Gateway REST API and then creates a deployment in that API Gateway stage.
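Concretely, an endpoint like this usually maps to a Lambda behind API Gateway’s proxy integration, which hands the HTTP request to the function as an event dict and expects a response with a status code and a string body. Here is a minimal sketch (the handler name and query parameter are illustrative, not from the original project):

```python
import json

def handler(event, context):
    """Minimal Lambda handler for an API Gateway proxy integration.

    API Gateway passes the HTTP request as `event` and expects back a
    dict with statusCode, headers, and a JSON string body.
    """
    # queryStringParameters is None when the request has no query string
    name = (event.get("queryStringParameters") or {}).get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"hello, {name}"}),
    }

# Simulate the event API Gateway would send for GET /hello?name=dev
response = handler({"queryStringParameters": {"name": "dev"}}, None)
```

Deploying this handler behind a stage gives you a separate, isolated URL per stage (dev, prod, and so on).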

  • AWS Other Resources

Your project's other AWS resources have separate deployments for each stage which can be replicated across each region your project uses.

Components present a single experience for provisioning infrastructure and code across all cloud and SaaS vendors, saving development time.

Creating a Serverless Application

Let’s take a look at how you can use serverless components to create an entire serverless application. A Serverless solution consists of a web server, Lambda functions (FaaS), security token service (STS), user authentication and database.

Creating a Serverless Application


  • Client Application: The UI of your application, rendered client side.
  • Web Server: Amazon S3 acts as a robust and simple web server that can serve the static HTML, CSS and JS files for our application.
  • Lambda function (FaaS): The key enabler in a serverless architecture. In the framework shown above, AWS Lambda is used for logging in and accessing data, reading from and writing to your database, and providing JSON responses.
  • Security Token Service (STS): STS generates temporary AWS credentials (API key and secret key) for users of the application to invoke the AWS API (and thus invoke Lambda).
  • User Authentication: User login can be added to mobile and web apps through an identity service integrated with AWS Lambda. It also helps authenticate users through social identity providers and SAML identity solutions.
  • Database: AWS DynamoDB provides a fully managed NoSQL database; DynamoDB is used as an example here.

Any cloud service can be packaged as a serverless component.
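To make the Lambda-plus-DynamoDB piece of that architecture concrete, here is a sketch of how a function might assemble a DynamoDB GetItem request and wrap the result as a JSON response. The table and attribute names are assumptions for illustration; in a deployed Lambda the request dict would be passed to `boto3.client("dynamodb").get_item(**params)`:

```python
import json

def build_get_item_request(table_name, user_id):
    """Build the parameters for a DynamoDB GetItem call.

    DynamoDB keys use a typed attribute format ({"S": ...} for strings).
    """
    return {
        "TableName": table_name,
        "Key": {"userId": {"S": user_id}},
    }

def to_json_response(item):
    """Wrap a retrieved item as the JSON response a Lambda would return."""
    return {"statusCode": 200, "body": json.dumps(item)}

params = build_get_item_request("users", "42")
resp = to_json_response({"userId": "42", "plan": "free"})
```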

Understanding Through An Example

You want to write a serverless image processing API that pulls images from S3. To do so, you might create an AWS API Gateway endpoint to call an AWS Lambda function, which then pulls an image from the AWS S3 bucket and modifies it.

All serverless components can be nested in a larger component to create a serverless image processing API, as shown in the image below:

Here’s why this is important: when you create this image processing API, you would otherwise configure each component individually. This can be avoided by nesting those infrastructure-level components in a higher-order component, which can expose a simpler configuration and be reused elsewhere.

Composing components to form an entire application

The complete web application can be built by nesting the serverless components.


The Serverless framework believes in making infrastructure more invisible, enhancing developers’ ability to focus on outcomes, and fostering a community that shares and reuses outcomes.

CloudFormation is an AWS service that bundles together all the necessary pieces to make a Lambda do actual work. It treats a complete serverless ‘stack’ as a configuration file that can be moved and deployed in different environments.

How do we tell our Lambda where it’s running, and how do we give it the configuration it needs to interact with other services?

We need secrets to authenticate to our DB, but we also need our Lambda to know that it’s running on staging so that it doesn’t try to update the production database during our test runs.

So we can identify three key sections of our serverless app: our function, its resources, and the secrets/configuration that make up its environment.
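One common way to handle the secrets/configuration section (a sketch; the variable and resource names here are assumptions) is to inject the stage and secrets as Lambda environment variables and derive resource names from them, so the same code runs unchanged in every stack:

```python
import os

def load_config(environ=os.environ):
    """Derive stage-specific settings from environment variables.

    The stage suffix on resource names is what keeps a staging run
    from ever touching the production table.
    """
    stage = environ.get("STAGE", "dev")
    return {
        "stage": stage,
        # resource names carry the stage so each active stack is isolated
        "table_name": f"orders-{stage}",
        # secrets are injected by the deployment tooling, never hard-coded
        "db_password": environ.get("DB_PASSWORD", ""),
    }

# Simulate the environment a staging deployment would inject
config = load_config({"STAGE": "staging", "DB_PASSWORD": "s3cret"})
```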

In a highly virtualized environment, it can be difficult to pinpoint where a particular piece of code is running. The ‘environment’, the stack, its configuration, and its secrets will collectively exist across multiple zones or even multiple services.

An active stack refers to a complete set of functions, resources, and environment. Dev, test and prod can be three active stacks where you’re running your code. If your production stack is distributed across three different AWS regions, you again have three active stacks.

To adopt a serverless model for part of your architecture, it is important to build a plan to manage all of these pieces. You must have:

  • Programmers to write functions and also manage their source code
  • Cloud professionals to manage the resources those functions need
  • Operations and security to deploy these stacks in the right environments

Srijan can help you assemble matching stacks and their environments, and easily define complete applications and re-deploy them in different AWS regions. Contact our experts with your requirements.

Topics: Microservices, AWS, Cloud

Setting up a Data Lake architecture with AWS

Posted by Gaurav Mishra on Aug 27, 2019 5:50:00 PM

We’ve talked quite a bit about data lakes in the past couple of blogs. We looked at what is a data lake, data lake implementation, and addressing the whole data lake vs. data warehouse question. And now that we have established why data lakes are crucial for enterprises, let’s take a look at a typical data lake architecture, and how to build one with AWS. 

Before we get down to the brass tacks, it’s helpful to quickly list out what the specific benefits that we want an ideal data lake to deliver. These would be:

  • The ability to collect any form of data, from anywhere within an enterprise’s numerous data sources and silos. From revenue numbers to social media streams, and anything in between.
  • Reduce the effort needed to analyze or process the same data set for different purposes by different applications.
  • Keep the whole operation cost efficient, with the ability to scale up storage and compute capacities as required, and independent of each other.


And with those requirements in mind, let’s see how to set up a data lake with AWS.

Data Lake Architecture

A typical data lake architecture is designed to:

  • take data from a variety of sources
  • move it through some sort of processing layer
  • make it available for consumption by different personas within the enterprise

So here are the key parts of the architecture to consider:

Landing zone: This is the area where all the raw data comes in, from all the different sources within the enterprise. This zone is strictly meant for data ingestion, and no modelling or extraction should be done at this stage.

Curation zone: Here’s where you get to play with the data. The entire extract-transform-load (ETL) process takes place at this stage, where the data is crawled to understand what it is and how it might be useful. The creation of metadata, or applying different modelling techniques to it to find potential uses, is all done here.

Production zone: This is where your data is ready to be consumed by different applications, or to be accessed by different personas. 

Data Lake architecture with AWS

With our basic zones in place, let’s take a look at how to create a complete data lake architecture with the right AWS solutions. Throughout the rest of this post, we’ll try to bring in as many AWS products as applicable in each scenario, but focus on a few key ones that we think bring the best results. 

Landing Zone - Data Ingestion & Storage

For this zone, let’s first look at the available methods for data ingestion:

  • Amazon Direct Connect: Establish a dedicated connection between your premises or data centre and the AWS cloud for secure data ingestion. With an industry-standard 802.1Q VLAN, Amazon Direct Connect offers a more consistent network connection for transmitting data from your on-premise systems to your data lake.
  • S3 Accelerator: Another quick way to enable data ingestion into an S3 bucket is to use the Amazon S3 Transfer Acceleration. With this, your data gets transferred to any of the globally spread out edge locations, and then routed to your S3 bucket via an optimized and secure pathway. 
  • AWS Snowball: You can securely transfer huge volumes of data onto the AWS cloud with AWS Snowball. It’s designed for large-scale data transport and is one-fifth of the cost of transferring data via high-speed internet. It’s a great option for transferring voluminous data assets like genomics, analytics, image or video repositories.
  • Amazon Kinesis: Equipped to handle massive amounts of streaming data, Amazon Kinesis can ingest, process and analyze real-time data streams. The entire infrastructure is managed by AWS, so it’s highly efficient and cost-effective. You have:
    • Kinesis Data Streams: Ingest real-time data streams into AWS from different sources, creating arbitrary binary data streams that are replicated across multiple availability zones by default.
    • Kinesis Firehose: You can capture, transform, and quickly load data onto Amazon S3, RedShift, or ElasticSearch with Kinesis Firehose. The AWS-managed system autoscales to match your data throughput, and can batch, process and encrypt data to minimize storage costs.
    • Kinesis Data Analytics: One of the easiest ways to analyze streaming data, Kinesis Data Analytics picks any streaming source, analyzes it, and pushes the results out to another data stream or to Firehose.
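As a concrete sketch of the Firehose path, the snippet below builds a PutRecordBatch payload (the stream name and record fields are illustrative). In a live pipeline the resulting dict would be passed to `boto3.client("firehose").put_record_batch(**payload)`; here we only construct it:

```python
import json

def build_firehose_batch(stream_name, records):
    """Build a PutRecordBatch payload for Kinesis Data Firehose.

    Firehose expects each record's Data field as bytes; a trailing
    newline keeps records line-delimited once they land in S3.
    """
    return {
        "DeliveryStreamName": stream_name,
        "Records": [
            {"Data": (json.dumps(r) + "\n").encode("utf-8")} for r in records
        ],
    }

payload = build_firehose_batch(
    "clickstream-to-s3",
    [{"user": "u1", "action": "click"}, {"user": "u2", "action": "view"}],
)
```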

Storage - Amazon S3

One of the most widely used cloud storage solutions, Amazon S3 is perfect for data storage in the landing zone. S3 is a region-level, multi-availability-zone storage option. It’s a highly scalable object storage solution offering 99.999999999% durability. 

But capacity aside, Amazon S3 is suitable for a data lake because it allows you to set a lifecycle for data to move through different storage classes:

  • Amazon S3 Standard to store hot data that is being immediately used across different enterprise applications
  • Amazon S3 Infrequent Access to hold warm data that is accessed less often across the enterprise but needs to be available rapidly whenever required
  • Amazon S3 Glacier to archive cold data at a very low cost compared to on-premise storage
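That tiering can be expressed as an S3 lifecycle configuration. The sketch below builds the rules dict (rule ID and day thresholds are assumptions); in practice it would be passed to `s3.put_bucket_lifecycle_configuration(Bucket=..., LifecycleConfiguration=...)`:

```python
def build_lifecycle_rules(ia_days=30, glacier_days=90):
    """Build an S3 lifecycle configuration that moves objects from
    Standard to Infrequent Access, then to Glacier, as they cool down."""
    return {
        "Rules": [
            {
                "ID": "tier-cold-data",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix: apply to every object
                "Transitions": [
                    {"Days": ia_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }

lifecycle = build_lifecycle_rules()
```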

Curation Zone - Catalogue and Search

Because information in the data lake is in the raw format, it can be queried and utilized for multiple different purposes, by different applications. But to make that possible, usable metadata that reflects technical and business meaning also has to be stored alongside the data. This means you need to have a process to extract metadata, and properly catalogue it.

The metadata contains information on the data format, security classification (sensitive, confidential, etc.), and additional tags (source of origin, department, ownership, and more). This allows different applications, and even data scientists running statistical models, to know what is being stored in the data lake.

Data Lake Architecture - populating metadata

Source: Screengrab from "Building Data Lake on AWS", Amazon Web Services, Youtube

The typical cataloguing process involves Lambda functions written to extract metadata, which get triggered every time an object enters Amazon S3. This metadata is stored in a SQL database and uploaded to AWS ElasticSearch to make it available for search.
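The extraction step of such a Lambda might look like the sketch below, which pulls basic metadata out of the S3 `ObjectCreated` event (the bucket and key values are illustrative). Writing the result to the database and to Elasticsearch is left out:

```python
import os
import urllib.parse

def extract_metadata(s3_event):
    """Pull basic metadata out of an S3 object-created event.

    S3 events carry one or more Records, each with bucket and object
    details; keys arrive URL-encoded, so they are decoded first.
    """
    entries = []
    for record in s3_event.get("Records", []):
        obj = record["s3"]["object"]
        key = urllib.parse.unquote_plus(obj["key"])
        entries.append({
            "bucket": record["s3"]["bucket"]["name"],
            "key": key,
            "size_bytes": obj.get("size", 0),
            # infer the data format from the file extension
            "format": os.path.splitext(key)[1].lstrip(".") or "unknown",
        })
    return entries

sample_event = {"Records": [{"s3": {
    "bucket": {"name": "landing-zone"},
    "object": {"key": "sales/2019/q2.csv", "size": 1024},
}}]}
metadata = extract_metadata(sample_event)
```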

AWS Glue is an Amazon solution that can manage this data cataloguing process and automate the extract-transform-load (ETL) pipeline. The solution runs on Apache Spark and maintains Hive-compatible metadata stores. Here’s how it works:

  • Define crawlers to scan data coming into S3 and populate the metadata catalog. You can schedule this scanning at a set frequency, or trigger it on every event
  • Define the ETL pipeline, and AWS Glue will generate the ETL code in Python
  • Once the ETL job is set up, AWS Glue manages running it on a Spark cluster infrastructure, and you are charged only when the job runs
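A crawler definition for the first step can be sketched as below. The names, role ARN, and paths are assumptions for illustration; the resulting dict matches the parameters `boto3.client("glue").create_crawler(**crawler)` expects:

```python
def build_crawler_definition(name, role_arn, database, s3_path, schedule=None):
    """Build the parameters for a Glue create_crawler call.

    The crawler scans the given S3 path and populates the Glue Data
    Catalog; an optional cron schedule re-runs it at a set frequency.
    """
    params = {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }
    if schedule:
        params["Schedule"] = schedule  # e.g. nightly re-scan
    return params

crawler = build_crawler_definition(
    "landing-zone-crawler",
    "arn:aws:iam::123456789012:role/GlueCrawlerRole",
    "data_lake_catalog",
    "s3://landing-zone/raw/",
    schedule="cron(0 2 * * ? *)",  # every day at 2 AM UTC
)
```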

The AWS Glue catalog lives outside your data processing engines and keeps the metadata decoupled, so different processing engines can simultaneously query it for their individual use cases. The metadata can also be exposed through an API layer built with API Gateway, with all catalog queries routed through it.

Curation Zone - Processing

Once cataloging is done, we can look at data processing solutions, which can be different based on what different stakeholders want from the data.

Amazon Elastic MapReduce (EMR)

Amazon’s EMR is a managed Hadoop cluster that can process a large amount of data at low cost.

A typical data processing setup involves standing up a Hadoop cluster on EC2, setting up the data and processing layers, setting up a VM infrastructure, and more. This entire process can be easily handled by EMR. Once configured, you can spin up new Hadoop clusters in minutes, point them at any S3 bucket to start processing, and let the cluster disappear once the job is done. 

Data Lake Architecture - Amazon EMR Benefits-1

Source: Screengrab from "Building Data Lake on AWS", Amazon Web Services, Youtube

The primary benefit of processing with EMR rather than Hadoop on EC2 is the cost savings. With the latter, your data lies within the Hadoop processing cluster, which means the cluster needs to stay up even after the processing job is done, so you are still paying for it. With EMR, however, your data and processing layers are decoupled, allowing you to scale each independently. While your data resides in S3, your Hadoop clusters on EMR can be set up and stopped as required, making the cost of processing completely flexible. Costs can be lowered further by integrating EMR with Amazon spot instances for lower pricing.

Amazon ElasticSearch

This is another scalable managed search node cluster that can be easily integrated with other AWS services. It’s best for log analytics use cases. 

Amazon RedShift

If you have a lot of BI dashboards and applications, Amazon RedShift is a great processing solution. It’s inexpensive, fully managed, and ensures security and compliance. With RedShift, you can spin up a cluster of compute nodes to simultaneously process queries.

This processing stage is also where enterprises can set up their sandbox, opening up the data lake to data scientists to run preliminary experiments. Because data collection and acquisition are now taken care of, data scientists can focus on finding innovative ways to put the raw data to use. They can bring in open-source or commercial analytics tools to create the required test beds, and work on creating new analytics models aligned with different business use cases.

Production Zone - Serve Processed Data

With processing done, the data lake is ready to push out data to all necessary applications and stakeholders. You can have data going out to legacy applications, data warehouses, BI applications and dashboards, accessed by analysts, data scientists, business users, and other automation and engagement platforms.

So there you have it: a complete data lake architecture, and how it can be set up with best-of-breed AWS solutions. 

Looking to set up an optimal data lake infrastructure? Talk to our expert AWS team, and let’s find out how Srijan can help.

Topics: AWS, Architecture

Integrating Drupal with AWS Machine Learning

Posted by Kimi Mahajan on Aug 23, 2019 11:24:00 AM

With enterprises looking for ways to stay ahead of the curve in the growing digital age, machine learning is providing them with the needed boost for seamless digital customer experience.

Machine learning algorithms can transform your Drupal website into an interactive CMS and can come up with relevant service recommendations targeting each individual customer needs by understanding their behavioural pattern.

A machine-learning-integrated Drupal website ensures effortless content management and publishing, better targeting, and empowers your enterprise to craft personalized experiences for your customers. It automates customer service tasks and frees up your customer support teams, subsequently impacting RoI.

However, with various big names competing in the market, let’s look at how Amazon’s machine learning stands out, and at the customised offerings it provides when integrated with Drupal.

Benefits of Integrating AWS Machine Learning with Drupal

AWS offers one of the widest sets of machine learning services, including pre-trained AI services for computer vision, language, recommendations, and forecasting. These capabilities are built on a comprehensive cloud platform and are optimized without compromising security. Let’s look at the host of advantages they offer when integrated with Drupal.

Search Functionality

One of the major problems encountered while searching on a website is the need to use exact keywords. If the content uses a related keyword, you will not be able to find it without using the correct keyword.

This problem can be solved by using machine learning to train the search algorithm to look for synonyms and display related results. The search functionality can also be improved by automatically filtering results according to a user’s past reads, click-through rate, and so on.

Amazon CloudSearch is designed to help users improve the search capabilities of their applications and services by setting up a scalable, low-latency search domain that can handle high throughput.

Image Captioning

Amazon Machine Learning helps automatically generate relevant captions for all images on a website by analyzing their content. The admin can configure whether captions are added automatically or after manual approval, saving a lot of time for the content curators and administrators of the website.

Amazon Rekognition helps search several images to find content within them, and helps segregate them almost effortlessly, with minimal human interaction.
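To make the captioning idea concrete: Rekognition’s `detect_labels` call returns a response shaped like `{"Labels": [{"Name": ..., "Confidence": ...}, ...]}`, and a Drupal integration might turn that into a caption as sketched below (the thresholds and sample labels are assumptions; calling Rekognition itself via boto3 is left out):

```python
def caption_from_labels(rekognition_response, min_confidence=80.0, max_labels=3):
    """Turn a Rekognition detect_labels response into a short caption.

    Keeps only labels above a confidence threshold, most confident
    first, and joins the top few into a phrase.
    """
    labels = [
        label["Name"]
        for label in sorted(
            rekognition_response.get("Labels", []),
            key=lambda l: l["Confidence"],
            reverse=True,
        )
        if label["Confidence"] >= min_confidence
    ]
    return ", ".join(labels[:max_labels]) if labels else "Image"

# A hand-written sample response; a real one would come from
# boto3.client("rekognition").detect_labels(...)
sample = {"Labels": [
    {"Name": "Beach", "Confidence": 99.1},
    {"Name": "Person", "Confidence": 93.4},
    {"Name": "Umbrella", "Confidence": 71.0},
]}
caption = caption_from_labels(sample)
```

The caption could then be stored in the image field’s alt text, either directly or pending manual approval, as described above.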

Website Personalization

Machine learning ensures users get to view tailored website content based on their favourite reads and searches, by assigning each user a unique identifier (UID) and tracking their behaviour (clicks, searches, favourite reads, etc.) on the website for a personalized web experience.

Machine learning analyzes the data connected with the user’s UID and provides personalized website content.

Amazon Personalize is a machine learning service which makes it easy for developers to create individualized recommendations for their customers. It saves up to 60% of the time needed to set up and tune the infrastructure for machine learning models, as compared to setting up your own environment.

Amazon Comprehend is another natural language processing (NLP) service that uses machine learning to find insights and relationships in text. It can identify which topics are the most popular, making recommendations easy. So, when you’re trying to add tags to an article, instead of searching through all possible options, you see suggested tags that match the topic.
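A toy version of tag suggestion can be approximated with simple term frequency. This is only a stand-in to show the shape of the feature; a real implementation would call Amazon Comprehend’s key-phrase or topic detection instead of counting words:

```python
# Naive tag suggestion by term frequency, ignoring common stop words.
# Purely illustrative; Amazon Comprehend would do proper NLP here.

from collections import Counter
import re

STOP_WORDS = {"the", "a", "an", "is", "to", "of", "and", "in", "for"}

def suggest_tags(text, k=3):
    """Return the k most frequent non-stop-word terms as candidate tags."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return [w for w, _ in counts.most_common(k)]
```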

Vulnerability Scanning

A website is always exposed to potential threats, with the risk of losing confidential customer data.

Using machine learning, Drupal-based websites can be made more secure and resilient to data loss by automatically scanning themselves for vulnerabilities and notifying the administrator. This gives websites a great advantage and also saves the extra cost of external security software.

Amazon Inspector is an automated security assessment service that helps improve the security and compliance of websites deployed on AWS, assessing them for exposure, vulnerabilities, and deviations from best practices.

Voice-Based Operations

With machine learning, it’s possible to control and navigate your website using your voice. With Drupal standing by its commitment to accessibility, integrating Amazon Machine Learning features promotes inclusion and makes web content accessible to more people.

Amazon Transcribe is an automatic speech recognition (ASR) service. Integrated with a Drupal website, it benefits the media industry with live subtitling of news and shows, helps video game companies stream transcriptions for hearing-impaired players, enables courtroom stenography and lets lawyers make legal annotations on top of live transcripts, and boosts business productivity by capturing meeting notes through real-time transcription.

The future of websites looks promising: data and behaviour analysis is predicted to give users an increasingly seamless experience. The benefits of integrating Amazon Machine Learning clearly give Drupal an advantage over other CMSs and pave the way for a brighter future and better roadmap.

Srijan has certified AWS professionals and expertise across AWS competencies. Contact us to get the conversation started.

Topics: Drupal, AWS, Machine Learning & AI, Planet Drupal

5 Reasons To Consider Serverless AWS For Drupal

Posted by Kimi Mahajan on Aug 5, 2019 1:01:00 PM

Using the cloud is about leveraging its agility, among other benefits. For a Drupal-powered website, the right service provider can determine how well the website performs, which in turn affects business revenue.

A robust server infrastructure such as AWS, backing the most advanced CMS, Drupal, accelerates a website’s performance, security and availability.

But why AWS and what benefits does it offer over others? Let’s deep dive to understand how it proves to be the best solution for hosting your Drupal websites.

Points To Consider For Hosting Drupal Websites

The following are the points to keep in mind while considering providers for hosting your pure or headless Drupal website.

Better Server Infrastructure: A Drupal-specialised cloud hosting provider should offer a server infrastructure specifically optimized for running Drupal websites the way they were designed to run.

Better Speed: It should help optimise the Drupal website to run faster, with the ability to use caching tools such as Memcached, Varnish, etc.

Better Support: The provider should offer hosting support backed by sound knowledge of Drupal websites.

Better Security and Compatibility: The hosting provider should be able to provide security notifications, server-wide security patches, and even pre-emptive server upgrades to handle nuances in upcoming Drupal versions.

Why not a traditional server method?

There are two ways of hosting Drupal website via traditional server setups: 

  • a shared hosting server, where multiple websites run on the same server
  • or a dedicated Virtual Private Server (VPS) per website.

However, there are disadvantages to this approach, which are:

  1. With many non-redundant, single-instance services running on the same server, a crash in any single component can take the entire site offline.
  2. The server does not scale up or down automatically; manual intervention is required to change the hardware configuration, and an unexpected traffic boost can bring the server down.
  3. The setup constantly runs at full power, irrespective of usage, wasting resources and money.

Hosting Drupal on AWS

Amazon Web Services (AWS) is a pioneer of the cloud hosting industry, providing hi-tech server infrastructure that has proved to be highly secure and reliable.

With serverless computing, developers can focus on their core product instead of worrying about managing and operating servers or runtimes, whether in the cloud or on-premises. It eliminates infrastructure management tasks such as server or cluster provisioning, patching, operating system maintenance, and capacity provisioning, and it enables you to build modern applications with increased agility, lower total cost of ownership, and faster time-to-market.

With serverless being the fastest-growing cloud trend, with an annual growth rate of 75% and foreseen to be adopted at a much higher rate, let’s understand the significance of the AWS components in a Virtual Private Cloud (VPC). Each of these components makes it the right choice for hosting pure or headless Drupal websites.

Architecture diagram showcasing Drupal hosting on AWS


  •  Restrict connection: NAT Gateway

A Network Address Translation (NAT) gateway enables instances in a private subnet to connect to the internet or other AWS services. The private instances are thus not exposed via the internet gateway; instead, all outbound traffic is routed through the NAT gateway.

The gateway helps ensure the site remains up and running, and AWS takes over the responsibility of maintaining it.


  • Restrict access: Bastion Host

A bastion host protects the system by restricting access to backend systems in protected or sensitive network segments, minimising the chances of a potential security attack.


  • Database: AWS Aurora

The Aurora database provides invaluable reliability and scalability, better performance and response times. With fast failover capabilities and storage durability, it minimizes technical obstacles.


  • Upload content: Amazon S3

With Amazon S3, you can store, retrieve, and protect any amount of data at any time in scalable storage buckets. You can recover lost data easily, pay only for the storage you actually use, protect data from unauthorized access, and upload and download data over SSL-encrypted connections.
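The "pay for the storage you actually use" model can be illustrated with a back-of-envelope tiered cost calculation. The per-GB rates and tier boundaries below are placeholders, not current AWS pricing; always check the S3 pricing page for real figures:

```python
# Back-of-envelope illustration of S3's tiered, pay-for-use storage pricing.
# The rates below are placeholders, NOT current AWS pricing.

TIERS = [  # (cumulative_cap_gb, usd_per_gb_month) -- illustrative numbers
    (51_200, 0.023),        # first 50 TB
    (512_000, 0.022),       # next 450 TB
    (float("inf"), 0.021),  # beyond 500 TB
]

def monthly_storage_cost(gb):
    """Sum cost across only the tiers actually consumed."""
    cost, prev_cap = 0.0, 0
    for cap, rate in TIERS:
        used = min(gb, cap) - prev_cap
        if used <= 0:
            break
        cost += used * rate
        prev_cap = cap
    return round(cost, 2)
```

The point of the sketch is the shape of the bill: storing nothing costs nothing, and each additional tier only applies to the gigabytes that spill into it.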


  • Memcached/Redis: Amazon ElastiCache

Amazon ElastiCache is a web service that makes it easy to set up, manage, and scale a distributed in-memory data store in the cloud.


  • Edge Caching: AWS CloudFront

CloudFront is AWS’s content delivery network: a globally distributed network of proxy servers that caches content close to consumers to improve download speeds.


  • Web servers: Amazon EC2

Amazon EC2 is a web service that provides secure, resizable compute capacity in the cloud.


  • Route 53

Amazon Route 53 effectively connects user requests to infrastructure running in AWS and can also be used to route users to infrastructure outside of AWS.

Benefits of Hosting Drupal Website on AWS

Let’s look at the advantages of AWS for hosting pure or headless Drupal websites.

High Performing Hosting Environment

The kind of performance you want from your server depends on the type of Drupal website you are building. A simple website with a decent amount of traffic can work well on a limited shared hosting platform. However, for a fairly complex, interactive Drupal site, a typical shared hosting solution might not be feasible.

Instead, opt for AWS, which provides server capacity billed as per your usage.

Improved Access To Server Environment 

A shared hosting environment restricts users from gaining full control, limits their ability to change configurations for Apache or PHP, and may cap bandwidth and file storage. These limitations are lifted only when you’re willing to pay a higher premium for advanced access and hosting services.

This is not true with AWS, which gives you direct control over your server instances, with permissions to SSH or use its interface control panel to adjust settings.

Control over the infrastructure 

Infrastructure needs rarely remain constant and are bound to change with time. With traditional hosting, adding or removing resources can be difficult or even impossible, and you end up paying for unused resources.

With AWS, however, you pay only for the services you use and can shut them off easily when you no longer need them. On-demand virtual hosting and a wide variety of services and hardware types make AWS convenient for anyone and everyone.

No Long-term Commitments

If you are hosting a website to gauge the performance and responsiveness, you probably would not want to allocate a whole bunch of machines and resources for a testing project which might be over within a week or so.

The convenience of AWS on-demand instances means that you can spin up a new server in a matter of minutes, and shut it down (without any further financial cost) in just as much time.

No Physical Hardware Maintenance

The advantage of using virtual resources is avoiding the need to buy and maintain physical hardware.

Going with virtually hosted servers on AWS lets you focus on your core competency, creating Drupal websites, and frees you from dealing with data center operations.

Why Choose Srijan?

Srijan’s team of AWS professionals can help you migrate your website to the AWS cloud. With expertise in building Drupal-optimised hosting environments on AWS for reliable enterprise-level hosting, we can help you implement various AWS capabilities as per your enterprise’s requirements. Drop us a line and let our experts explore how you can get the best of AWS.

Topics: Drupal, AWS, Cloud

AWS - The Right Cloud for Media and Entertainment Workloads

Posted by Kimi Mahajan on Aug 2, 2019 2:32:00 PM

The media landscape is transforming the way content is produced and consumed, raising user expectations of more personalized experiences from anywhere, at any time, on any device.

This is driving huge operational changes as media companies migrate from the traditional broadcasting method to a digital distribution model. Several media giants are adopting new cloud technologies to manage the explosive growth of digital content.

Media enterprises are shifting to AWS, the pioneer in cloud hosting, to take advantage of its highly scalable, elastic and secure cloud services.

But how beneficial is AWS in solving the challenges of the media and entertainment industry? Let’s understand the benefits of moving to the cloud and why AWS offers the best services in the cloud arena.

Why Do Media Enterprises Need to Shift to the Cloud?

In a survey, 35% of respondents said their enterprises moved to the cloud for easier collaboration on post-production tasks.

Chart: types of businesses responding (Source: Backblaze)

The constant pressure on media firms to invest resources in generating high-quality, creative content, and the need to prevent data losses from natural and man-made catastrophes, is pushing them to the cloud.

So, how is the cloud helping the media and entertainment industry with its major challenges? Let’s review them one by one.

1. Huge Consumer Demand

Today’s consumers of media and entertainment content expect a huge choice of content, with demand varying rapidly and needing to be met in real time.

The media and entertainment sector needs to cost-effectively meet volatile demand, and remain flexible in terms of automatically spinning servers up and down as demand increases or decreases.

2. Continuous Supply of Content

In order to stay competitive, content creators in the media field are under constant pressure to produce and/or distribute original content more frequently, at an accelerated rate.

With the cloud, it’s easier to store, manage, and deliver gigantic amounts of digital content. Hybrid and multi-cloud deployments can provide an even greater measure of flexibility, allowing workloads to shift seamlessly across public and private infrastructures.

3. Cost Benefits of Cloud Computing

The cable and broadcast television segment of the media and entertainment sector is being challenged by new trends in television broadcasting. Agile, low-cost over-the-top (OTT) companies selling and/or delivering streaming media content directly to consumers over the internet are competing against traditional media distribution methods.

Other challenges for media companies are the rising costs of content licensing and shortened technology lifecycles.

By shifting to the cloud’s OPEX model, media companies can reduce the costs of their storage and delivery technologies and infrastructure.

4. High Performance With Minimal to Zero Delays

For a good user experience, it is critical that content streams with minimal delays and downtime. A six-second delay in streaming an ad for a show can cause a huge loss, with customers likely to switch to another entertainment channel.

The cloud provides architectures that support high availability and uncompromised performance SLAs.

Advantages of AWS for Media Enterprises

AWS tools and services help media enterprises monitor and manage their storage and compute usage and costs.

For major tasks around content production, storage, processing, and distribution, AWS brings scalable, elastic and secure cloud services. Equipped with deep learning, machine learning, and natural language processing and understanding, it delights digital media creators with personalized experiences through smarter content investments.

Secure, Scalable and Cost-Effective Solution

66% of respondents say security is their greatest concern when adopting an enterprise cloud computing platform.

AWS remains the best choice for media companies looking to adopt a cloud model. As per a Cloud Security Alliance report, Amazon Web Services is the most popular public cloud infrastructure platform, comprising 41.5% of application workloads in the public cloud.

Multinational entertainment firms have become scalable and make content available to consumers anytime, anywhere, by leveraging AWS cloud services.

It also remains a cost-effective solution for media enterprises, which pay per use for the services they leverage.

Cloud Computing is Changing Economics of Media and Publishing

Simplified Content Creation and Production

Media enterprises need not worry about geographic and resource constraints; their only focus should be creating quality content with HDR, VR, AR and beyond to keep viewers engaged.

With AWS, you can connect with worldwide production talent, unlimited capacity, unsurpassed security and the most innovative cloud technology partners in the industry.

Now you can leverage machine learning and analytics for insights that improve production investment decisions, tailored to consumers’ needs. Pre-processing and optimization for false takes or cuts comes easy with AWS: ML-assisted production edits provide quick turnaround for dailies and editorial review, and prohibited content can easily be flagged for filtered viewing.

Efficient Storage Provider

Media enterprises now have a one-stop solution for their storage concerns in AWS’s multi-tiered storage offering, which includes Amazon Simple Storage Service (Amazon S3), S3 Infrequent Access, and Amazon Glacier. These solutions allow massive data ingestion and the elasticity to satisfy ever-increasing storage demand, along with cost management.

Eases Digital Distribution and the Post-Production Process

AWS solves the concerns of broadcasting quality video workflows in the cloud and ensures seamless delivery to any device, anytime, anywhere.

Media enterprises need not worry about live, linear, and on-demand content: AWS specialises in delivering professional-quality media experiences to viewers in much less time, and with less effort and expense, than a traditional data center.

Pay-as-you-go pricing and fully automated resource scaling let you handle an audience of any size without upfront capital investment. Instead of managing complex infrastructure, AWS video solutions let you focus on creating user-engaging content.

Live Streaming, Subtitling, Video on Demand Service

Making content understandable to a large audience is easy with AWS cloud solutions, which help generate multilingual subtitles for live over-the-top streaming.

With AWS, viewers can choose a movie or video from a wide array of video-on-demand (VOD) content, available for broadcast and multi-screen delivery.

Migrating VFX rendering to AWS helps media companies shorten content production times and fosters collaboration with contributors from around the world.

Let’s understand how AWS has been beneficial for giant names in media and entertainment.

Description: A prominent name in streaming online content on smart TVs, game consoles, PCs, Macs, mobiles, tablets and more.

Business Challenges:

  • Unable to scale
  • Unable to meet increased user demand
  • Huge infrastructure unable to manage data storage

Solution and Benefits:

  • Accelerated deployment of servers and data storage
  • Streams high-quality content from anywhere, on any device
  • Improved scalability, with a better architecture
  • Containers optimized their microservices architecture

Discovery Communications

Description: A leader in nonfiction media, reaching more than 1.8 billion cumulative subscribers in 218 countries and territories.

Business Challenges:

  • Required easy-to-manage website infrastructure
  • Was seeking a cost-effective solution
  • Wanted to consolidate multiple delivery engines
  • Needed a scalable and flexible solution
  • Wanted to switch to a pay-as-you-go model

Solution and Benefits:

  • Migrated more than 40 sites to AWS
  • Highly scalable architecture
  • Entire continuous delivery system and development platform built around the AWS API
  • Low latency, along with cost savings of 20-25 percent and better manageability


Media and entertainment companies have begun to embrace cloud computing as their technology of choice. Reduced IT operational costs and anytime, anywhere access to high-quality content will soon trigger global adoption of cloud solutions across media and entertainment.

Srijan is an AWS Advanced Consulting Partner. Contact us today to discuss how our AWS-trained professionals can help you migrate your media and entertainment apps to AWS.

Topics: AWS, Cloud, Media & Publishing

How to conduct AWS cost optimization of your workload

Posted by Gaurav Mishra on Jul 30, 2019 12:04:00 PM

Your enterprise operates on AWS’s consumption-based model, but is your setup fully cost-optimized? Are you able to utilize your resources well, achieve outcomes at the lowest possible price point, and meet your functional requirements?

If not, you are underutilizing the capabilities of your AWS cloud.

AWS offers several services and pricing options that give your enterprise the flexibility to manage costs while keeping performance on par. And while it is relatively easy to optimize costs in small environments, scaling successfully across a large enterprise requires certain operational best practices and process automation.

Here’s a look at six AWS cost optimization pillars to follow, regardless of your workload or architecture:

Right size your services

AWS gives you the flexibility to adapt your services to current business requirements, and to shift to new service options as demands change, addressing new business needs anytime without penalties or incidental fees.

Thus, through right sizing, you can:

  • use the lowest cost resource that still meets the technical specifications of a specific workload

  • adjust the size of your resources to optimize for costs

  • meet the exact capacity requirements you have without having to overprovision or compromise capacity. This allows you to optimize your AWS workload costs.

Amazon CloudWatch and Amazon CloudWatch Logs are the key AWS services supporting a right-sizing approach: they let you set up monitoring to understand your resource utilization.
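The right-sizing rule can be expressed as a small decision function: given the peak utilization observed in monitoring, pick the cheapest instance type that still covers it. The instance catalogue and hourly prices below are made up for illustration and are not real AWS figures:

```python
# Sketch of the right-sizing rule: cheapest instance type whose capacity
# covers the observed peak usage. Catalogue and prices are illustrative.

CATALOGUE = [  # (name, vcpus, memory_gib, usd_per_hour) -- made-up figures
    ("small",  2,  4, 0.05),
    ("medium", 4,  8, 0.10),
    ("large",  8, 16, 0.20),
]

def right_size(peak_vcpus, peak_mem_gib):
    """Return the cheapest type that meets both CPU and memory peaks."""
    candidates = [(price, name) for name, cpu, mem, price in CATALOGUE
                  if cpu >= peak_vcpus and mem >= peak_mem_gib]
    if not candidates:
        raise ValueError("no instance type is large enough")
    return min(candidates)[1]
```

In practice the peak figures would come from CloudWatch metrics rather than being passed in by hand.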

Appropriately provisioned 

AWS Cloud gives you the ability to modify the attributes of your AWS managed services, in order to ensure there is sufficient capacity to meet your needs. You can turn off resources when they are not being used, and provision systems based on the requirements of your service capacity.

As a result, your excess capacity is kept to a minimum and performance is maximized for end users. This also helps optimize costs to meet your dynamic needs.

AWS Trusted Advisor helps monitor services such as Amazon Redshift and Amazon RDS for resource utilization and active connections, while the AWS Management Console can modify attributes of AWS services to help align resource needs with changing demand. Amazon CloudWatch is also a key service for an appropriately provisioned approach, enabling you to collect and track usage metrics.

Leverage the right pricing model

AWS provides a range of pricing models: On-Demand and Spot Instances for variable workloads, and Reserved Instances for predictable workloads. You can choose the right pricing model as per the nature of your workload to optimize your costs.

1. On-Demand Instances

With On-Demand Instances, you pay for compute capacity by the hour or second, depending on which instances you run. No long-term commitments or upfront payments are needed. These instances are recommended for applications with short-term or unpredictable workloads that cannot be interrupted.

For example, with a resource like DynamoDB on demand, you simply pay for what you use, with no long-term commitments.

2. Spot Instances

A Spot Instance is unused EC2 capacity offered at a discount. You set the maximum price you are willing to pay; the instance launches while the current Spot price (which fluctuates in real time based on demand and supply) is below your maximum, and it can be reclaimed whenever the Spot price rises above it.

Using Spot Instances can lower your operating costs by up to 90% compared to On-Demand Instances. They are ideal for fault-tolerant use cases like batch processing, scientific research, image or video processing, financial analysis, and testing.

3. Reserved Instances

Reserved Instances enable you to commit to a period of usage (one or three years) and save up to 75% over equivalent On-Demand hourly rates. For applications with predictable usage, they provide significantly more savings than On-Demand Instances, without requiring any change to your workload.
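To see how the three models compare on a steady workload, here is a rough calculation using the discount ceilings quoted above (up to 90% for Spot, up to 75% for Reserved) and a placeholder on-demand rate; real savings depend on instance type, region, and term:

```python
# Rough comparison of pricing models for a steady workload, using the
# best-case discounts quoted in the text. The on-demand rate is a placeholder.

ON_DEMAND_RATE = 0.10  # USD per instance-hour, illustrative only

def monthly_cost(hours, model):
    """Monthly cost under a pricing model, applying its best-case discount."""
    discounts = {"on_demand": 0.0, "reserved": 0.75, "spot": 0.90}
    return round(hours * ON_DEMAND_RATE * (1 - discounts[model]), 2)

# A 24x7 instance (~730 hours/month) under each model:
for model in ("on_demand", "reserved", "spot"):
    print(model, monthly_cost(730, model))
    # on_demand 73.0, reserved 18.25, spot 7.3
```

Even at these toy rates, the spread shows why matching the pricing model to the workload's interruption tolerance and predictability matters.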

AWS Cost Explorer is a free tool for analyzing your costs: identify your spend on AWS resources, spot areas that need further analysis, and see trends that provide a better understanding of your costs.

Geographic selection

Another best practice is to place your computing resources close to your users. This ensures lower latency, supports data sovereignty, and minimizes your costs.

Every AWS region operates within local market conditions, so resource pricing differs by region. It is up to you to make the right geographic selection so that you can run at the lowest possible price globally.

The AWS Simple Monthly Calculator can help you estimate and compare the cost of architecting your solution in various regions around the world. Alongside it, AWS CloudFormation or AWS CodeDeploy can help you provision a proof-of-concept environment in different regions, run workloads, and analyze the exact and complete system costs for each region.

Managed services

Using AWS managed services not only removes much of your administrative and operational overhead but also reduces the cost of managing your infrastructure. Because they operate at cloud scale, the cost per transaction or service is efficiently lowered, and managed services also help you save on license costs.

Amazon RDS, Amazon DynamoDB, Amazon Elasticsearch Service, and Amazon EMR are some of the key AWS services supporting a managed approach. These services reduce the cost of capabilities and free up time for your developers and administrators.

Optimize data transfer

Lastly, architecting for data transfer can help you optimize costs. This involves using content delivery networks to locate data closer to users (effectively done with Amazon CloudFront), or using dedicated network links from your premises to AWS (as done with AWS Direct Connect).

Using AWS Direct Connect can help reduce network costs, increase bandwidth, and provide a more consistent network experience than internet based connections.

Starting with these best practices early in your journey will help you establish the right processes and ensure success when you hit scale.

AWS provides a set of cost management tools out of the box to help you manage, monitor, and ultimately optimize your costs. Srijan is an AWS Advanced Consulting Partner, with AWS-certified teams experienced in working with a range of AWS products and delivering cost-effective solutions to global enterprises.

Ready to build cloud-native applications with AWS? Just drop us a line and our expert team will be in touch.

Topics: AWS, Cloud


Write to us


See how our uniquely collaborative work style can help you redesign your business.

Contact us