Should Your Enterprise Go For a Headless Commerce Architecture?

Posted by Nilanjana on Oct 25, 2019 10:03:25 PM

E-commerce enterprises looking to deliver seamless user experiences often wonder how to do so. Is there a way that does not require them to invent their own IoT devices or build back-end solutions from scratch?

Enter Headless Commerce.

An extension of the headless content management system, headless commerce offers you the capabilities to build customized user experiences across channels, paving the way for omnichannel retail. 

Here’s a deep dive into discovering all about headless commerce, and whether it is the right choice for your business.

What is a Headless Commerce Architecture?

A headless commerce architecture separates the frontend of your e-commerce experience from the backend. Doing so allows for greater architectural flexibility, letting your frontend developers focus solely on customer interactions without worrying about the impact on critical backend systems.

At the same time, it leaves your backend developers free to use APIs to deliver things like products, blog posts or customer reviews to any screen or device.

The headless commerce architecture entirely separates the presentation layer of your store from business-critical processes like order and inventory management, payment processing, and shipping. It delivers a platform via a RESTful API that comprises a back-end data model and a cloud-based infrastructure.

The headless commerce system works very much like a headless CMS, by passing requests between the presentation and application layers through web services or application programming interface (API) calls.

For example, when the user clicks a “Buy Now” button on their smartphone, the presentation layer sends an API call to the application layer to process the customer’s order. The application layer then sends a response back to the presentation layer to show the customer the status of their order.
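A minimal sketch of that exchange in Python. The endpoint path and field names here are hypothetical, not any specific platform’s API:

```python
import json

# Illustrative payload the presentation layer (a mobile app) might POST
# to a headless commerce backend when the user taps "Buy Now".
order_request = {
    "method": "POST",
    "path": "/api/v1/orders",
    "body": {
        "customer_id": "cust-1042",
        "items": [{"sku": "SHOE-BLK-42", "quantity": 1}],
    },
}
payload = json.dumps(order_request["body"])  # what the device actually sends

# The application layer processes the order and replies with a status
# that any presentation layer (phone, kiosk, smartwatch) can render.
order_response = {"order_id": "ord-77812", "status": "confirmed"}

def render_status(response):
    """Channel-agnostic status message for the customer."""
    return f"Order {response['order_id']} is {response['status']}"
```

Because the response is plain data rather than rendered HTML, each device is free to present it however best suits its screen.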

How is it Different From Traditional Commerce?

The key difference is architectural: traditional commerce platforms tightly couple the frontend presentation layer to the backend, while headless commerce connects the two only through APIs. That decoupling is what drives the advantages covered below.

Advantages of Using Headless Commerce

A headless commerce architecture brings a number of advantages for e-commerce businesses. Here’s a look at some of the reasons why you should consider going headless:

Omnichannel Experience

Using a headless CMS gives you the flexibility to propel your content anywhere and everywhere. For an e-commerce brand, that means the ability to deliver your products, demo videos, testimonials, blog posts, customer reviews, etc. to any channel that emerges.

More Customization and Personalization

As explained earlier, headless commerce systems give you the control to manage the look and feel of your brand. You can design a customized experience for your admins as well as customers right from scratch, without finding yourself hitting restrictions.


A decoupled architecture allows you to make rapid changes in the frontend without disturbing the backend, and vice versa. Also, new functionalities and integrations can be applied in much less time, because of the openness of the architecture. 

Faster Time to Market

Using headless commerce to build an omnichannel retail experience facilitates a faster time to market. Brands can focus solely on building frontend experiences across different touchpoints, as the content and products are housed centrally and delivered via API. 

Agile Marketing

A headless commerce system can support new technologies as and when they arise, making it perfect for designing new customer experiences. This enables marketing teams to roll out multiple sites across different brands, divisions, and portfolios in a matter of days.

Seamless Integrations

A headless commerce system uses APIs, which makes it easier to integrate and communicate with other platforms. You can also add your brand or e-commerce platform to any new device in a matter of hours.

Better Conversion Optimization

With headless commerce systems, you can easily deploy changes on your e-commerce platform. You can run multiple tests at once and quickly roll out optimizations based on the results, helping you constantly improve your e-commerce experiences.

When Not to Use Headless Commerce?

You are aware now of the reasons you should use a headless commerce architecture. And while there are a lot of them, it is also prudent to note a couple of drawbacks that come with it:

Costs Involved

Headless commerce does not provide you with a frontend; developers have to build their own from scratch, which can be both time-consuming and costly. Plus, developers will need to troubleshoot their own frontend creations, leading to ongoing costs beyond the initial build.

Marketer Isolation

Since the headless commerce system offers no frontend presentation layer, marketers can no longer:

  • Create content in a WYSIWYG (what you see is what you get) environment
  • Preview how content will look on the end user’s device or screen
  • Quickly ideate, approve, create, and publish content without relying on another department

This makes marketers totally dependent on the IT team, not just to build the front-end presentation layer, but also to update it and populate it with content. 

So then what should you do? Well, focus on your business requirements. 

Do the above-listed drawbacks not mean much to your business compared to the advantages headless brings? Then you should absolutely go ahead. But if they do, there is a third option: decoupled commerce.

A decoupled commerce system differs from headless commerce only in that it doesn’t remove the frontend delivery layer from the backend entirely. It gives marketers back their power, while also giving the brand the same headless freedom to deliver content to different devices, applications, and touchpoints through APIs. It is, in a nutshell, the best of both worlds. And your choice should entirely depend on your business needs.

At Srijan, we have expert Drupal teams to help e-commerce enterprises assess their current architecture, and identify if headless or decoupled is the way to go. Post assessment, we work to set up the respective Drupal architecture, based on your exact business requirements. 

Get ready to deliver immersive online experiences across customer touchpoints. Let’s get the discussion started on implementing decoupled Drupal for your enterprise.

Topics: Retail & Commerce, Architecture

Delivering CaaS with Acquia Content Cloud

Posted by Nilanjana on Oct 22, 2019 3:29:00 PM

The website is no longer the sole arena of your brand’s digital experiences. We’ve transcended that and now a customer’s interaction with your brand is fragmented across multiple different channels - from the tiny smartwatch on their wrist to giant digital displays, from their mobile application to their in-flight screens. So your content needs to be on all these channels as well. 

But short of hiring numerous content writers and editors to write and reformat content for all these different channels, how do you play this game? The answer is Content as a Service.

Drupal has already proved its mettle when it comes to managing a huge volume of content at the backend. And now Drupal is channeling its decoupled capabilities to deliver a streamlined platform for CaaS - the Acquia Content Cloud.

What is Acquia Content Cloud

Acquia Content Cloud is a platform that allows content creators to write, edit, and review content independent of the channel where it will be published. The content created here can be pushed to multiple different channels simultaneously or at different times, and will be automatically formatted to best fit the channel. 

In essence, Acquia Content Cloud enables headless content creation and management for delivering multichannel digital experiences.

Though built on Drupal 8.7, the Content Cloud is a CaaS solution that can be used irrespective of whether your website and other display applications run on Drupal. You also do not have to worry about setting up Drupal (or upgrading to a new version of Drupal) to be able to fully leverage Acquia Content Cloud. Because it’s being made available as software, you will have everything you need enabled out of the box.

Why use Acquia Content Cloud?

The whole challenge with delivering a multichannel digital experience is the fact that different channels have different ways of consuming and displaying content. And that can lead to some significant challenges:

Tedious Content Reformatting

Publishing content on different channels means copying and pasting the same content into the editing platforms for each of these channels. 

Let’s say you are a news outlet publishing a particular article. Your content team uploads it to the website, with a headline, byline, images, text, etc., through the CMS. To send out the same news in an email newsletter, the team has to go into your emailing platform and enter the headline and maybe some text, depending on the design and structure of your email. To showcase the same story on the large digital displays in the office lobby, the team has to log into yet another interface and re-enter the content, maybe just the headline, the byline, and the image this time.

All of this is just your team repeating the same work in different display-system backends, time they could have spent getting more stories out. This repetition also creates room for errors and confusion about which version is the latest, and even more work when there are real-time developments in a story.

Difficult Authoring Interfaces

While your CMS may be editor-friendly (it likely is, if it’s on Drupal), not all your display channels are easy to use at the backend. The more complex they are, the more time your content team spends publishing content on them. That lack of efficiency can quickly get in the way of effective and impactful digital experiences.

Content Silos

When you are reformatting content and separately publishing it on each platform, your content begins to exist in silos. There is no single place where you can view all content, or check exactly which display channels a particular content piece has been published on. You also cannot track revisions, or know whether a particular piece of content has been updated on all the display channels it was published on.

Basically, what you have is a whole lot of confusion and very limited visibility into your content.

The Acquia Content Cloud eliminates these problems by being a centralized platform for content creation and editing, which can then push it out to different channels. Here, you can enter all your content into a well structured template, where it can be stored, managed, approved and revised. Different channels can consume this content via APIs, and display it as needed, with no need for reformatting. 

How does the Acquia Content Cloud work?

With Acquia Content Cloud, the content team can create each content piece to include complete information and different media formats without worrying about how it would look when displayed on a particular channel. 

The solution is designed on the concept of flexible or atomic content. The platform breaks down any content piece into different smaller parts, with each being entered in a different field. 

For example, you have your headline, byline, summary, rich media, and body text entered into different fields. All of these are now reusable components that can be picked up based on which display channel you want to push the content to.

So for the website, all the components of the content piece get pushed out. For the digital banner, only the headline, byline, and image component get pulled for display. All other channels similarly pull in the component they require to most effectively display that piece of content. 
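The component-per-channel idea above can be sketched as follows. The field names and channel list here are illustrative, not Acquia Content Cloud’s actual data model:

```python
# One content piece stored as reusable "atomic" components (fields).
content_piece = {
    "headline": "Acme launches new product line",
    "byline": "Jane Doe",
    "summary": "A short summary for previews.",
    "image": "https://example.com/hero.jpg",
    "body": "Full article text...",
}

# Which components each display channel consumes (hypothetical mapping).
channel_fields = {
    "website": ["headline", "byline", "summary", "image", "body"],
    "digital_banner": ["headline", "byline", "image"],
    "email": ["headline", "summary"],
}

def pull_for_channel(piece, channel):
    """Return only the components a given channel displays."""
    return {field: piece[field] for field in channel_fields[channel]}

banner = pull_for_channel(content_piece, "digital_banner")
```

The content is authored once; each channel simply pulls the subset of components it needs, with no reformatting.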

The general content publishing workflow on the Acquia Content Cloud platform goes something like this:

  • The content writer starts with creating a project. Choosing the type of content they are creating - blog/promotional/case study etc, and choosing the different channels where they want the content to be displayed
  • Next, they create a complete content piece by filling out the different fields, including rich media like videos, gifs, animations and more
  • They can schedule a publish time, for the piece to be available on all channels simultaneously
  • The platform sends a review notification to the editor, who can review the content, make changes, or trigger a revision workflow if needed
  • All changes happen on the base version of the content, making it easier to track changes and keep all channels updated
  • Once okayed, the content is pulled into different applications via APIs; multiple API formats like GraphQL, JSON, etc. are supported
  • Once published, content writers and editors can also go in to make display changes to any channel if needed

And that’s it. That’s all your content team has to do to ensure content is displayed well across multiple channels. You create your content once and publish it everywhere. Your content team is free to do actual content creation, rather than copy-pasting to different channels. And your marketing team can rest assured that every brand interaction, on every channel, is optimized, updated, and immediate.

Acquia Content Cloud is currently available for private beta testing and you can sign up for it to test it out yourself. 

Meanwhile, if you are looking at decoupled Drupal solutions to enable advanced digital experiences at your enterprise, Srijan’s expert Drupal teams can help. We are also Acquia implementation partners, helping brands leverage Acquia’s suite of personalization, customer journey orchestration, digital asset management, and cloud hosting offerings.

Tell us a bit about your project and let's explore how our Drupal experts can help.

Topics: Drupal, Planet Drupal, Omnichannel, Acquia

5 smart content management solutions for marketers on Drupal

Posted by Nilanjana on Sep 18, 2019 3:26:00 PM

With every new release, Drupal is emerging as a valuable asset in the enterprise marketing stack. The additions to Drupal core, especially with Drupal 8 and after, have made it a digital platform that comes equipped for all the standard marketing best practices right out of the gate. In addition, the larger Acquia ecosystem is delivering new solutions that empower Drupal to be more than just a CMS. These bring in some powerful martech capabilities that have made Drupal a platform ready to deliver the results that enterprise marketing teams want.

This post delves into the key modules and solutions that enable smart content management in Drupal, both in terms of creating and publishing content, as well as leveraging that content in diverse ways to drive results.

Smart Content

Smart Content is a Drupal 8 toolset that helps deliver anonymous website personalization in real time. Essentially, site admins get the ability to display different content to site visitors, based on whether they are authenticated or anonymous users.

Some examples of how you can leverage it include:

  • Displaying a smart block showcasing your latest offer or most popular blog to a first time visitor to the site
  • Displaying a smart block that showcases different industry specific case studies for different users in your database
  • Displaying a smart block only for mobile viewers of your site, maybe asking them to view it on your mobile app

Now, this module in itself has limited functionality, but it becomes very useful when combined with two companion features:

Smart Content Blocks

Included within the Smart Content module, these allow you to insert a Smart Block on any page and set up conditions that govern the content being displayed within the block. These conditions can be used to hide or show a specific content in certain cases, and form the basic personalization rules for your Drupal site. The blocks have an easy interface within the Drupal 8 backend, giving you the flexibility to add any number of blocks, anywhere on a page. 

It's important to note that all of your content, irrespective of format, is available to show and promote through Smart Content Blocks. Ebooks, videos, images, blogs, service pages—anything that’s already in the Drupal CMS can be delivered to a block. 

Smart Content Segments

A complete set of conditions grouped together to achieve a reusable composite condition. For example, a set of the following three conditions:

  • showcase only on desktop
  • showcase if location is France
  • showcase for anonymous users


will create a smart content segment that can be applied to any smart content block, ensuring it is displayed only to anonymous users from France viewing the site on a desktop. This feature saves you time, as you don't have to set up the same set of individual conditions every time.
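Conceptually, a segment is just a named, reusable conjunction of conditions. A rough sketch of that idea (the real Smart Content module configures this through the Drupal admin UI, not in code):

```python
# A segment groups conditions; all must match for a block to display.
def make_segment(name, *conditions):
    def matches(visitor):
        return all(cond(visitor) for cond in conditions)
    return {"name": name, "matches": matches}

# The three conditions from the example above, as simple predicates.
france_desktop_anon = make_segment(
    "anonymous-desktop-france",
    lambda v: v["device"] == "desktop",
    lambda v: v["country"] == "FR",
    lambda v: not v["authenticated"],
)

visitor = {"device": "desktop", "country": "FR", "authenticated": False}
show_block = france_desktop_anon["matches"](visitor)  # True for this visitor
```

The same segment object can then be attached to any number of blocks, which is exactly the time-saving reuse described above.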

At the heart of Smart Content are the conditions, editable rules that allow you to govern the display of content. The interface is easy to manage, and familiar to marketers working on a Drupal site. 


You have your choice of the basic conditions for personalization like the browser, language, device type etc. You also have the more advanced options like targeting different industries based on third party IP lookups, or tapping into existing segmentations or campaigns from a marketing automation system. Essentially, anything that has an API with available data can be used as conditions to help drive your personalization strategy with Smart Content.

Layout Builder

The Layout Builder module, experimental in Drupal 8.5 and 8.6, had a stable release with Drupal 8.7. This module allows content authors to easily build and change page layouts and configure the presentation of individual content, content types, media and nodes. It also allows you to add user data, views, fields and menus. 

This is a huge asset for enterprise marketing and digital experience teams because:

  • The module gives a drag-and-drop interface to create custom layouts for specific website sections and pages, with the ability to override templates for individual landing pages when required
  • Content authors can seamlessly embed video across the site to create a more interactive user experience, and increase engagement and conversions
  • Marketers can now build and preview new pages at their own pace, without the fear of negatively affecting the existing user experience.

All of this means that marketing teams now have more control over the site, and can make changes and additions independently. This also reduces the turnaround time for new campaigns by reducing, or even eliminating, dependencies on development teams. Think high-impact landing pages designed exactly as you want, but without the waiting around or constant back-and-forth with developers.

Media Library

With the release of Drupal 8.7, the CMS now has a stable media library module.

It provides a visually appealing interface for browsing through all the media items in your site. With the new version, multimedia properties can be added to content either by selecting from existing media or by uploading new media through bulk upload support. Once uploaded, users can remove or reorder any images ready for import. 

It provides an easy way to upload several media assets to your Drupal website quickly, letting you add alt text and check images before uploading.

Powered by Views, it allows site builders to customize the display, sorting, and filtering options.

Acquia Lightning

As enterprise marketing teams launch large-scale campaigns, they often need to put together new microsites that work flawlessly. And they usually have to do it at short notice, to leverage critical marketing opportunities in time.

Having to depend upon the development teams to create one from scratch, and the constant coordination required to make that happen, can lead to the marketing team losing precious time. 

Acquia Lightning, an open source Drupal 8 distribution, is the perfect solution for this challenge. Lightning gives you a basic ready-to-launch site with pre-selected modules and configurations that can cut development time by 30%. This allows:

  • Development teams to publish optimized Drupal 8 sites in short time frames
  • Editorial teams to easily work with layout, media, and content on these sites, and have them campaign-ready in no time

Some of the key features in Lightning that are particularly great for marketers are:

Moderation Dashboard

This dashboard gives you complete visibility into your Drupal content status, with a structured overview of where every piece of content is in the editorial process. Besides tracking content status, you can also manage access controls determining who can access which pieces of content at the backend.


The key pieces of information you can view on the dashboard are:

  • Current drafts in progress
  • Content you created
  • Content needing review
  • Recent site activity
  • Individual editor activity in the last 30 days

Moderation Sidebar


The moderation sidebar allows you to stay on the website frontend as much as possible while making edits and managing the editorial process for any piece of content. Actions like editing text and layout, publishing a piece, creating a new draft, and more can be easily achieved with the sidebar, and it's quickly accessible by clicking "New Tasks" on any piece of content. For marketers not really keen on getting into the backend, this sidebar is a simple way to make the edits they need, with minimal chance of error.



Scheduled Publishing

As the name suggests, this feature in Acquia Lightning allows you to set a piece to publish at a future date. This functionality gives you a better view of when content is set to launch, and also ensures that it launches at optimal times, according to reader preferences. And this happens without you having to be on the job at odd hours, just waiting around to publish content.


You can schedule publish times on individual pieces by editing the 'Current Status' to select “Schedule a Status Change”, then choosing “Published” and selecting your preferred publishing date and time.

Acquia Lift

We cannot talk of smart content management with Drupal without talking about Acquia Lift. For enterprise sites built on Drupal, there’s nothing more suitable for personalization than Acquia Lift.

Acquia Lift is a solution designed to bring in-context, personalized experiences to life. It’s a powerful blend of data collection, content distribution, and personalization that enables enterprise marketing teams to closely tailor the user experience on the site. And all this without excessive dependence on development or IT teams.

Acquia Lift gives enterprises three key elements to drive their personalization and reflect it with their website content:

Profile Manager

This helps build a holistic 360 degree profile of your users, right from when they are anonymous visitors on the site, up until the stage where they are repeat customers. It collects user demographic data, historical behaviour data, and real-time interactions so you can get a complete understanding of who your users are, what they want, and then work on how best to deliver that.

Content Hub

The Content Hub is a cloud-based, secure content syndication, discovery and distribution tool. Any piece of content created within the enterprise can be aggregated and stored here, ready to be pushed out to any channel, in any format. 

Faceted search and automatic updates give visibility into the entire gamut of content being created within the enterprise - in different departments, across websites, and on different platforms.

Experience Builder

This is the heart of Acquia Lift - the element that allows you to actually build out a personalized experience from scratch. The Experience Builder is a completely drag-and-drop tool that lets you customize segments of your website to showcase different content to different target segments, based on data pulled from the Profile Manager.

Enterprise marketing teams can 

  • set up rules that define what content should be shown to which segment of site visitors
  • perform A/B tests to accurately determine what type of content drives more conversions for which user segments. 


All this can be done with simple overlays atop the existing website segments, without impacting the base site, and without depending on IT teams for implementation.

With a commitment to creating ambitious digital experiences, every new Drupal release has brought in new features to add to the marketing ecosystem. While the overarching focus is on being flexible and scalable, these solutions are creating real impact on customer experience, conversions, online sales and brand proliferation.

And for enterprise teams contemplating shifting to Drupal from diverse proprietary CMSes, the payoff from an empowered marketing team alone makes it worth the effort.

While most of the features mentioned here can be accessed by your teams easily if they are already using Drupal, some require guidance. Acquia Lightning and Acquia Lift in particular will need skilled teams to set them up for you, before marketers can start reaping the benefits.

If you are looking to deploy Lift or Lightning, just drop us a line and our Drupal experts will be in touch.

Topics: Drupal, MarTech

12 Factor Apps and Their Benefits For Cloud Native Applications

Posted by Nilanjana on Aug 30, 2019 5:32:00 PM
“Good code fails when you don’t have a good process and a platform to help you. Good teams fail when you don’t have a good culture that embraces DevOps, microservices and not giant monoliths,” said Java framework Spring’s Tim Spann when asked about the reason for choosing 12-factor apps.

If your enterprise team is often struggling with overly complicated, slowed-down app deployment, the 12-factor app methodology could be the go-to solution for you.

What are 12 Factor Apps?

A methodology created specifically for building Software as a Service (SaaS) apps, the 12-factor approach can help you avoid the headaches typically associated with long-term enterprise software projects.

Laid down by one of Heroku’s founders, these 12 design principles function as an outline to guide the development of a good architecture.

  • They include defined practices around version control, environment configuration, isolated dependencies, executing apps as stateless resources and much more
  • Work with a combination of backing services like database, queue, memory cache
  • and utilize modern tools to build well-structured and scalable cloud-native applications

Interestingly, however, they are not a recipe for designing the full system, but rather a set of prerequisites that can get your projects off to a great start. Here’s a look at the 12 factors:

#1 Codebase

There should be only a single codebase per app, but multiple deployments are possible.

Multiple apps sharing the same code violates the twelve-factor methodology. The solution is to factor the shared code into libraries that can be included through the dependency manager.

As for multiple deployments, it’s possible with the same codebase being active across them, although in different versions.

#2 Dependencies

A 12-factor app declares all of its dependencies explicitly, completely and exactly, via a dependency declaration manifest. You must also use a dependency isolation tool along with dependency declaration.

  • Dependency declaration is required because it simplifies setup for developers who are new to the app.
  • Using a dependency isolation tool during execution ensures that no implicit dependencies “leak in” from the surrounding system.
  • Declaration and isolation must be used together, because neither alone is sufficient to satisfy twelve-factor.

#3 Config

Storing config as constants in the code violates the 12-factor methodology. Instead, config should be stored in environment variables (env vars). Why? Config varies substantially across deploys, whereas code does not; env vars, on the other hand, are easy to change between deploys without changing any code.

Secondly, these env vars are independently managed for each deploy; they are never grouped together as “environments”. This model scales up smoothly as the app naturally expands into more deploys over its lifetime.
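A minimal sketch of this factor, with illustrative variable names. The same code runs in every deploy; only the env vars differ:

```python
import os

# Read deploy-specific values from the environment instead of
# hard-coding them as constants. Variable names are hypothetical.
def load_config(env=os.environ):
    return {
        "database_url": env.get("DATABASE_URL", "postgres://localhost/dev"),
        "cache_host": env.get("CACHE_HOST", "localhost"),
        "debug": env.get("DEBUG", "false").lower() == "true",
    }

# A production deploy sets its own env vars; no code change needed.
prod_config = load_config({"DATABASE_URL": "postgres://db.prod/app", "DEBUG": "true"})
```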

#4 Backing services

Under the 12-factor process, backing services are treated as attached resources, independent of whether they are locally managed or third party services. They can be accessed easily via a URL or other credentials, and even swapped one for the other.

The result? If your app’s database is misbehaving because of a hardware issue, you can simply spin up a new database server restored from a recent backup. The current production database could be detached, and the new database attached – all without any code changes.
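Because the app knows the resource only by its URL, that swap is purely a config change. A sketch with hypothetical hostnames:

```python
from urllib.parse import urlparse

# Backing services as attached resources: the app only holds a URL.
# A real app would open a driver connection here; this just parses it.
def connect(resource_url):
    parts = urlparse(resource_url)
    return {"scheme": parts.scheme, "host": parts.hostname, "db": parts.path.lstrip("/")}

old = connect("postgres://db-old.internal:5432/app")
# Detach the misbehaving database and attach the restored one: only the
# URL (i.e., the config) changes, never the code.
new = connect("postgres://db-restored.internal:5432/app")
```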

#5 Build, release, run

There should be a strict separation between the build, release, and run stages. This ensures that no changes can be made to the code at runtime, since there is no way to propagate those changes back to the build stage.

Why is this necessary? Because runtime execution (unlike a build) can happen automatically, such as when a server reboots or the process manager restarts a crashed process. If code could change at runtime, such an automatic restart could break the app, and that is a major problem, particularly if no developers are on hand.

#6 Processes

Twelve-factor processes are stateless and share-nothing. It is never assumed that anything cached in memory or on disk will be available on a future request or job. All the data compiling is done during the build stage, and everything that needs to persist is stored in a stateful backing service, typically a database.

#7 Port binding

A 12-factor app is completely self-contained, and does not rely on the runtime injection of a webserver into the execution environment to create a web-facing service. It always exports services via port binding, and listens to requests coming on that port.

Almost any kind of server software can be run via a process binding to a port, and awaiting incoming requests. Examples include HTTP, ejabberd, and Redis.

#8 Concurrency

To ensure the scalability of your app, you should deploy more copies of your application (processes) rather than trying to make your application larger. The share-nothing, horizontally partitionable nature of twelve-factor app processes means that adding more concurrency is a simple and reliable operation. 

To do this, the developer has to architect the app to handle diverse workloads by assigning each type of work to a process type. For example, HTTP requests may be handled by a web process, and long-running background tasks handled by a worker process.

#9 Disposability

The twelve-factor app’s processes are disposable, i.e.,

  • They start up in minimal time
  • They can shut down gracefully at a moment’s notice
  • They are robust against sudden crashes or failures

All of this facilitates fast elastic scaling, rapid code deployment or config changes, as well as robustness of production deploys.

#10 Dev/prod parity

A 12-factor app is designed for continuous deployment by minimizing the sync gap between development and production. Here’s how:

  • Time gap: reduced to hours
  • Personnel gap: code authors and deployers are the same people
  • Tools gap: using similar tools for development and production

Keeping development, staging, and production as similar as possible ensures anyone can understand the app and release it. This enables development with fewer errors, and also better scalability.

#11 Logs

A twelve-factor app is never concerned with routing or storage of its output stream, or with writing and managing log files. Instead, each running process writes its event stream, unbuffered, to stdout. During local development, the developer views this stream in the foreground of their terminal to observe the app’s behavior.

This factor is more about excellence than adequacy. While success is possible even without logs as event streams, the pay-off of doing this is significant. 

#12 Admin processes

Apps should run admin/management tasks as one-off processes, in an identical environment as the regular long-running processes of the app. They run against a release, using the same codebase and config as any process run against that release. Admin code must ship with application code to avoid synchronization issues. And dependency isolation techniques should also be the same for all process types.
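As a sketch of what “same codebase, same config” means in practice, the hypothetical one-off migration task below reads its configuration from the environment exactly as the long-running web process would (the `DATABASE_URL` variable and the migrate logic are illustrative):

```python
import os

def get_config():
    # The same config mechanism (environment variables) that the
    # long-running web and worker processes use.
    return {"database_url": os.environ.get("DATABASE_URL", "sqlite:///app.db")}

def migrate():
    # A one-off admin process: it runs against the same release
    # and config as the app, does its job, and exits.
    cfg = get_config()
    return f"migrating {cfg['database_url']}"
```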

This factor is more about managing your app than about developing services, but it is still important.

The Underlying Benefits

Thus the 12-factor apps methodology helps create enterprise applications that:

  • Use declarative formats for setup automation, minimizing the time and cost for new developers joining the project
  • Have a clean contract with the underlying operating system, offering maximum portability between execution environments
  • Are suitable for deployment on modern cloud platforms, removing the need for servers and systems administration
  • Minimize divergence between development and production, enabling continuous deployment for maximum agility
  • Can scale up without major changes to tooling, architecture, or development practices

Should You use the 12 Factors?

You are now well aware of the 12 factor apps methodology, as well as the advantages it brings. But is it always the right choice? Probably not. 

If you are an enterprise whose development team is still working to overcome the baggage of legacy, on-premise applications, you are probably not ready for 12 factor. The right use cases are new apps, or brownfield projects where you have already started refactoring and are completely reworking the application. And when you are building new cloud-native applications, that's when you definitely need 12 factor apps. 

It’s all about deciding what your main problem is and if this methodology can solve that problem. And of course, as always, you should prioritize what works for your team.

Our expert development teams at Srijan can help you understand your enterprise project requirements, and whether 12 factor apps can ensure a better app architecture. To know more, book a consultation.

Topics: Cloud, Architecture

Data Lake Strategy:  6 Common Mistakes to Avoid During Implementation

Posted by Nilanjana on Aug 29, 2019 5:42:00 PM

While we have talked a lot about the rising need for data lakes, it’s probably as important to talk about how easily they can go wrong in the absence of a good data lake strategy. While most businesses expect phenomenal insights, not enough attention is paid to actually setting it up in the right manner. And that is where it can all start to unravel. 

It's not uncommon to see scenarios where businesses have invested a lot of time, money and resources into building a data lake but it’s actually not being used. It can be that people are slow to adopt it or it could be that faulty implementation actually made the data lake useless. 

So here, we take a brief look at six common data lake strategy pitfalls, and how to avoid them. 

Challenges involved in Loading Data 

There are two challenges involved when loading data into a data lake:

Managing big data file systems requires loading an entire file at a time. While this is no big deal for small file types, doing the same for large tables and files becomes cumbersome. Hence to minimize the time for large data sets, you can try loading the entire data set once, followed by loading only the incremental changes. So you can simply identify the source data rows that have changed, and then merge those changes with the existing tables in the data lake.
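The incremental merge described above can be sketched as a simple keyed upsert. The key column and row shapes below are illustrative; a real implementation would use the lake engine’s own merge facilities:

```python
def merge_incremental(existing_rows, changed_rows, key="id"):
    # Index the rows already in the lake by their key, then upsert
    # only the source rows that changed since the last load.
    merged = {row[key]: row for row in existing_rows}
    for row in changed_rows:
        merged[row[key]] = row  # overwrite stale rows, insert new ones
    return list(merged.values())
```

Merging a handful of changed rows this way is far cheaper than reloading the entire table on every refresh.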

Loading data from the same data source into different parts of the data lake consumes too much capacity. As a result, the data lake gets a bad reputation for interrupting the operational databases used to run the business. Strong governance processes are required to ensure this doesn’t happen.

Lack of Pre-planning

Data lakes can store an unfathomable amount of data, but not assessing the value of data before dumping it is one major reason for their failure. While the point of a data lake is to have all of your company’s data in it, it is still important that you build data lakes in accordance with your specific needs. Balancing the kind of data you need with the amount of data you dump into the data lake ensures the challenges of data lake implementation are minimized.

Uncatalogued Data

When you store data in a data lake, you also need to make sure it is easy for analysts to find it. Merely storing all the data at once, without cataloguing it, is a big mistake for a few key reasons:

  • It can lead to accidentally loading the same data source more than once, eating into storage
  • Analysts cannot use data they cannot find, so the stored data delivers no value

Ensuring metadata storage is key to a data lake that’s actually useful. There are several technologies available to set up your data cataloguing process. You can also automate it within your data lake architecture with solutions like AWS Glue. 
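To make the cataloguing idea concrete, here is a hedged in-memory sketch; a real deployment would use a managed catalog such as AWS Glue, and the paths, fields, and tags shown are illustrative:

```python
catalog = {}

def register(path, description, tags):
    # Record where a data set lives and what it contains, at ingest
    # time, so analysts can find it later without scanning the lake.
    catalog[path] = {"description": description, "tags": set(tags)}

def find(tag):
    # Let analysts discover data sets by tag instead of by folder path.
    return sorted(p for p, entry in catalog.items() if tag in entry["tags"])
```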

Duplication of Data

When Hadoop distributions or clusters pop up all over the enterprise, there is a good chance you’re storing loads of duplicated data. As a result, data silos are created which inhibits big data analytics because employees can’t perform comprehensive analyses using all of the data.

All of this essentially re-creates the data proliferation problem data lakes were created to solve in the first place.

Inelastic Architecture

One of the most common mistakes organizations make is building their data lakes with inelastic architecture. Several of them start out with one server at a time, slowly and organically growing their big data environment, and adding high performance servers to keep up with business demands. While this decision is taken because data storage can be costly, it eventually proves to be a mistake in the long run, when the growth of data storage outpaces the growth of computing needs and maintaining such a large, physical environment becomes cumbersome and problematic.

Not the Right Governance Process

Not using the right governance process can be another obstacle to your data lake implementation. 

  • Too much governance imposes so many restrictions on who can view, access, and work on the data that no one ends up being able to access the lake, rendering the data useless
  • Not enough governance means that organizations lack proper data stewards, tools, and policies to manage access to the data. Unorganized and mismanaged data lakes can lead to an accumulation of low quality data, which is polluted or tampered with. Eventually the business stops trusting this data, rendering the entire data lake useless

Implementing good governance process and documenting your data lineage thoroughly can help illuminate the actions people took to ingest and transform data as it enters and moves through your data lake.

While this is by no means an exhaustive list, these are some of the most common mistakes businesses make. Plugging these holes in your data lake strategy sets you up for better returns from your initiative right out of the gate. It also ensures that your data lake does not become a data swamp where information and insights disappear without a trace.

Working on a data lake strategy for your enterprise? Or building the right data lake architecture to leverage and monetize your data?

Tell us a bit about your project and our experts will be in touch to explore how Srijan can help.

Topics: Project Management, Agile, Data Engineering & Analytics

Data Lake vs Data Warehouse: Do you need both?

Posted by Nilanjana on Jul 17, 2019 3:20:00 PM

Most enterprises today have a data warehouse in place that is accessed by a variety of BI tools to aid the decision making process. Data warehouses have been in use for several decades now, and have served enterprise data requirements quite well. 

However, as the volume and types of data being collected expands, there’s also a lot more that can be done with it. Most of these are use cases that an enterprise might not even have identified yet. And they won’t be able to do that until they have had a chance to actually play around with the data. 

That is where the data lake makes an entrance. 

We took a brief look at the difference between a data warehouse and lake when defining what is a data lake. So in this blog, we’ll dig a little deeper into the data lake vs data warehouse aspect, and try to understand if it’s a case of the new replacing the old or if the two are actually complementary.

Data lake vs. Data Warehouse

The data warehouse and data lake differ on 3 key aspects:

Data Structure

A data warehouse is much like an actual warehouse in terms of how data is stored. Everything is neatly labelled and categorized and stored in a particular order. Similarly, enterprise data is first processed and converted into a particular format before being accepted into the data warehouse. Also, the data comes in only from a select number of sources, and powers only a set of predetermined applications. 

On the other hand, a data lake is a vast and flexible repository where raw, unprocessed data can be stored. The data is mostly in unstructured or semi-structured format with the potential to be used by any existing business application, or ones that an enterprise could think of in the future.

The difference in data structure also translates into a critical cost advantage for the data lake. Cleaning and processing raw data to apply a particular schema on it is a time consuming process. And changing this schema at a later date is also laborious and expensive. But because the data lakes do not require a schema to be applied before ingesting the data, they can hold a larger quantity and wider variety of data, at a fraction of the cost of data warehouses.


Data Purpose

Data warehouses demand structured data because how that data is going to be used is already defined. As the cleaning and processing of data is expensive, the aim with data warehouses is to be as efficient with storage space as possible. So the purpose of every piece of data is known, with regards to what will be delivered to which business applications. That ensures that space is optimized to the maximum.

The purpose of the data flowing into the data lake is not determined. It’s a place to collect and hold the data, and where and how it will be used is decided later on. It usually depends on how that data is being explored and experimented with, and the requirements that arise with innovations within the enterprise.


Accessibility

Data lakes are overall more accessible as compared to data warehouses. Data in a data lake can be easily accessed and changed because it’s stored in its raw format. On the other hand, data in a data warehouse takes a lot of time and effort to transform into a different format, and data manipulation is also expensive.

Does the data lake replace the data warehouse?

No. A data lake does not replace the data warehouse, but rather complements it. 

The organized storage of information in data warehouses makes it very easy to get answers to predictable questions. When you know that business stakeholders need certain pieces of information, or analyze specific data sets or metrics regularly, the data warehouse is sufficient. It is built to ingest data in the schema that will quickly give the required answers. For example: revenue, sales in a particular region, YoY increase in sales, business performance trends - all can be handled by the data warehouse. 

But as enterprises begin to collect more types of data, and want to explore more possibilities from it, the data lake becomes a crucial addition.

As discussed, schema is applied to the data after it’s loaded into the data lake. This is usually done at the point when the data is about to be used for a particular purpose. How the data fits into a particular use case determines what schema will be projected onto it. This means that data, once loaded, can be used for a variety of purposes, and across different business applications. 

This flexibility makes it possible for data scientists to experiment with the data to figure out what it can be leveraged for. They can set up quick models to parse through the data, identify patterns, evaluate the potential business opportunities. The metadata created and stored alongside the raw data makes it possible to try out different schemas, view the data in different structured formats, to discover which ones are valuable to the enterprise. 
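This schema-on-read approach can be sketched as projecting a casting schema onto raw records at the point of consumption; the field names and casts below are illustrative, and the stored raw data itself is never rewritten:

```python
def apply_schema(raw_rows, schema):
    # schema maps field name -> casting function. Fields absent from a
    # raw record surface as None instead of failing the read.
    projected = []
    for row in raw_rows:
        record = {}
        for field, cast in schema.items():
            value = row.get(field)
            record[field] = cast(value) if value is not None else None
        projected.append(record)
    return projected
```

The same raw rows can then be read with one schema by a revenue-focused team and an entirely different schema by another.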

Given these characteristics of the data lake, it can augment a data warehouse in a few different ways:

  • Start exploring the potential of the data you collect, beyond the structured capabilities of your current data warehouse. This could be around new products and services you can create with these data assets, or even enhance your current processes. For example: leverage data lake to gather information of site visitors and use that to drive more personalized buyer journeys and evolving marketing strategies.
  • Use the data lake as a preparatory environment to process large data sets before feeding them into your data warehouse
  • Easily work with streaming data, as the data lake is not limited to batch-based periodic updates.

The bottomline is, the data warehouse continues to be a key part of the enterprise data architecture. It keeps your BI tools running and allows different stakeholders to quickly access the data they need. 

But the data lake implementation further strengthens your business because:

  • You have access to a greater amount of data that can be stored for use, irrespective of its structure or quality
  • Storage is cost effective because it eliminates the need for processing the data before storage
  • Data can be used for a larger variety of purposes without having to bear the cost of restructuring it into different formats
  • The flexibility to run the data through different models and applications makes it easier and faster to identify new use cases

In a market where the ability to leverage data in novel ways offers a critical competitive advantage, the focus should no longer be on data lake vs data warehouses. If enterprises want to stay ahead, they will have to realise the complementary functions of the data warehouse and the lake, and work towards a model that gets the best out of both.

Interested in exploring how a data lake fits into your enterprise infrastructure? Talk to our expert team, and let’s find out how Srijan can help.

Topics: Data Engineering & Analytics

Understanding Cloud Native Applications - What, Why, How

Posted by Nilanjana on Jul 10, 2019 1:23:00 PM

What are Cloud Native Applications?

Cloud native applications are the ones that are designed to optimally leverage the benefits of the cloud computing delivery model. The applications live in the cloud and not on an on-premise data centre. However, merely existing on the cloud does not make an application ‘cloud native’. The term refers to a fundamental change in how applications are developed and deployed, and not just where they are hosted. 

Cloud native applications are best described by a set of key characteristics that differentiate them from traditional applications:

  • Microservices architecture: They are built as a collection of loosely coupled services that handle different functions of the applications. Using the microservices architecture instead of the monolithic approach is what gives cloud native applications much of their speed and scalability.

  • 12 Factor applications: This refers to a set of 12 design principles laid out by a Heroku co-founder to help create applications that are well suited for the cloud. These include defined practices around version control, environment configuration, isolated dependencies, executing apps as stateless processes, and more
  • Platform-as-a-Service: Because cloud native apps run on microservices, which can number in the hundreds for any given application, provisioning new environments for each service in the traditional way is time and resource intensive. Using Platform-as-a-Service (PaaS) simplifies this process and can handle rapid provisioning for numerous microservice instances. This is also key to ensuring the scalability of cloud native applications.
  • API-based: Independent microservices in a cloud native application communicate via API calls. This preserves their loosely coupled nature and keeps the application fast and scalable.
  • Robust: Cloud native applications are robust, with minimal to zero downtime. Once again the microservices architecture, coupled with being on a highly available cloud environment, makes this possible.

Why go for Cloud Native Applications?

The manner in which cloud native applications are developed brings with it a distinct set of advantages for enterprises. These are:


Faster Time-to-Market

In a disruption heavy market, the time-to-market for new products and services is extremely crucial to success. Reaching potential customers before your competitors means achieving a faster go-to-market, and that’s possible with cloud native applications. The microservices architecture makes them easy to develop, test and deploy, as compared to monolithic applications. 

These applications also work with smaller but more frequent release cycles, that are easily reversible. So you can constantly introduce new features, functions and bug fixes for your applications, while also having the option of quick rollbacks if needed. 

Finally, with independent microservices, updates to a service need not be integrated with the code of the rest of the services. With the integration time eliminated, new functionalities can be quickly rolled out for these applications.


Scalability

The microservices architecture makes cloud native applications extremely scalable. This is because each microservice handles a specific function within an application. In cases of increase in demand, the application can be scaled by creating more instances of only those services that are needed to handle that demand. And provisioning new instances of a microservice can be done in seconds because the application is based on the PaaS model. 

Besides this, with cloud providers like AWS you get auto-scaling and elastic load balancing solutions that make it easier to dynamically scale resource utilization for cloud native applications.

Cost Efficiency

For monolithic applications, scaling to meet new demand involves creating a new instance of the entire monolith, and that is both a time and resource intensive process. It also means paying for more hardware resources in the cloud, even though the actual demand spike is only for a limited set of features.

With cloud native applications, scaling means increasing instances for only specific microservices. And that saves money as it eliminates the need to consume resources that will not be utilized. Also, it’s easy to turn off your consumption of extra resources once the spike in demand subsides.

There are also secondary cost savings generated with cloud native apps, in the form of multitenancy. Several different microservices can dynamically share platform resources leading to reduced expenditure.


High Availability

Cloud native applications are highly available, and that too is because of their microservices architecture. This works at two levels:

  • If one service goes down, the rest of the application continues to be available. This is because the application is designed with failsafes, and can always provision another working instance of the failed microservice.
  • The containerized nature of microservices means that they are packaged with their runtime environment. This makes them self-sufficient and designed to work uninterrupted, no matter where they are hosted. So in case an entire availability region of your cloud goes down, the application can simply be moved to a different region. And it will continue to be available, with your users none the wiser. 

How to get started with Cloud Native Applications?

Building cloud native applications involves a large scale change in how applications are developed and deployed within the organization. So getting started with it will require some preparation on the part of the enterprise. 

Some of the key aspects to consider would be:

Create your enterprise strategy

The shift to cloud native applications is being considered because it serves specific business goals - creating new products and services, gaining new market share, or increasing revenues. And these business goals are what should be kept front and center while creating your strategy for going cloud native. 

This will also help you avoid the trap of going down the technology-first route. Yes, cloud native applications will involve the use of new technology - languages, frameworks, platforms - by your team. But deciding to first lock down the technology aspects can be disastrous. That’s because the technology you choose should be able to serve your business goals. And if you haven’t figured those out first, the initiative will not be successful or sustainable.

So a good order of priority here is identifying:

  • Business goals to achieve with going cloud native
  • Right teams that can lead this, both in-house and as partners/vendors
  • Technology solutions that best suit your requirements

Transition away from the monolithic application

If you are working with a fairly complex monolithic application that has been put together over time, resist the temptation of a simple lift-and-shift to the cloud. Because of the tight coupling and the myriad dependencies that have developed over the years, it’s unlikely the monolith will run well on the cloud. So you need to plan for breaking down the monolith into constituent services that can be shifted to the cloud.

Moving towards a microservices architecture can seem daunting at first because you are dealing with hundreds of different services instead of a single one. However, with practices like event sourcing for microservices, deployment with Docker, and a host of other design guidelines for building an optimal microservices architecture, the process can be well understood and executed. 

CI/CD approach

Adopting a continuous integration/continuous delivery (CI/CD) approach is key to leveraging the speed benefits of cloud native applications. The system for rapidly developing and testing new features and pushing them out for use, as well as breaking down the traditional software development team silos, is crucial for cloud native applications. Frequent, well-tested releases help keep your cloud native application updated and allow for continuous improvement.

So that was a quick look at understanding cloud native applications, their advantages, and where to get started. Moving forward, you would also need to identify your cloud platform of choice, and our take on building cloud native applications with AWS might be helpful.

Srijan is assisting enterprises in modernizing applications with microservices architecture, primarily leveraging Docker and Kubernetes. Srijan is also an AWS Advanced Consulting Partner, with AWS certified teams that have the experience of working with a range of AWS products and delivering cost-effective solutions to global enterprises.

Ready to modernize your application architecture with microservices? Just drop us a line and our expert team will be in touch.

Topics: Microservices, Cloud, Architecture

Data Lake Implementation - Expected Stages and Key Considerations

Posted by Nilanjana on Jun 17, 2019 4:01:00 PM


Efficient data management is a key priority for enterprises today. And it’s not just to drive effective decision-making for business stakeholders, but also for a range of other business processes like personalization, IoT data monitoring, asset performance management and more.


Most enterprises are maturing out of their traditional data warehouses and moving to data lakes. In one of our recent posts, we covered what is a data lake, how it’s different from a data warehouse, and the exact advantages it brings to enterprises. Moving a step further, this post will focus on what enterprises can expect as they start their data lake implementation. This mostly centres around the typical data lake development and maturity path, as well as some key questions that enterprises will have to answer before and during the process.

Enterprise Data Lake Implementation - The Stages

Like all major technology overhauls in an enterprise, it makes sense to approach the data lake implementation in an agile manner. This basically means setting up a sort of MVP data lake that your teams can test out, in terms of data quality, storage, access and analytics processes. And then you can move on to adding more complexity with each advancing stage. 

Most companies go through four basic stages of data lake development and maturity:


Stage 1 - The Basic Data Lake

At this stage you’ve just started putting the basic data storage functionality in place. The team setting up the data lake has made all the major choices, such as using legacy or cloud-based technology for the data lake. They have also settled upon the security and governance practices they want to bake into the infrastructure.

With a plan in place, the team builds a scalable but currently low-cost data lake, separate from the core IT systems. It’s a small addition to your core technology stack, with minimal impact on existing infrastructure. 

In terms of capability, the Stage 1 data lake can:

  • Store raw data coming in from different enterprise sources
  • Combine data from internal and external sources to provide enriched information

Stage 2 - The Sandbox

The next stage involves opening up the data lake to data scientists, as a sandbox to run preliminary experiments. Because data collection and acquisition are now taken care of, data scientists can focus on finding innovative ways to put the raw data to use. They can bring in open-source or commercial analytics tools to create the required test beds, and work on creating new analytics models aligned with different business use cases.

Stage 3 - Complement Data Warehouses

The third stage of data lake implementation is when enterprises use it as complementary to existing data warehouses. While data warehouses focus on high-intensity extraction from relational databases, low-intensity extraction and cold or rarely used data is moved to the data lakes. This ensures that the data warehouses don’t exceed storage limits, while low priority data sets still get stored. The data lake offers an opportunity to generate insights from this data, or query it to find information not indexed by traditional databases.

Stage 4 - Drive Data Operations

The final stage of maturity is when the data lake becomes a core part of the enterprise data architecture, and actually drives all data operations. At this point, the data lake has replaced other data stores and warehouses, and is now the single source of all data flowing through the enterprise. 

The data lake now enables the enterprise to:

  • Build complex data analytics programs that serve various business use cases
  • Create dashboard interfaces that combine insights from the data lake as well as other applications or sources
  • Deploy advanced analytics or machine learning algorithms, as the data lake manages compute-intensive tasks

This stage also means that the enterprise has put in place strong security and governance measures to optimally maintain the data lake. 

Points to Consider Before Data Lake Implementation

While the agile approach is a great way to get things off the ground, there are always roadblocks that can kill the momentum on the data lake initiative. In most cases, these blocks are in the form of some infrastructural and process decisions that need to be made, to proceed with the data lake implementation. 

Stopping to think about and answer these questions in the middle of the project can cause delays, because now you also have to consider the impact of these decisions on work that’s already been done. That puts too many constraints on the project. 

So here’s a look at a few key considerations to get out of the way, before you embark on a data lake project:

Pin Down the Use Cases

Most teams jump to technology considerations around a data lake as their first point of discussion. However, defining a few of the most impactful use cases for the data lake should take priority over deciding the technology involved. That’s because these defined use cases will help you showcase some immediate returns and business impact of the data lake. And that will be key to maintaining support from higher up the chain of command, and keeping up project momentum.

Physical Storage - Get It Right

The primary objective of the data lake is storing the vast amount of enterprise data generated, in their raw format. Most data lakes will have a core storage layer to hold raw or very lightly processed data. Additional processing layers are added on top of this core layer, to structure and process the raw data for consumption into different application and BI dashboards.

Now, you can have your data lake built on legacy data storage solutions like Hadoop or on cloud-based ones, as offered by AWS, Google or Microsoft. But given the amount of data being generated and leveraged by enterprises in recent times, the choice of data storage should consider:

  • Your data lake architecture should be capable of scaling with your needs, without running into unexpected capacity limits
  • It should support structured, semi-structured and unstructured data in a central repository
  • The core layer should ingest raw data as-is, so a diverse range of schema can be applied as needed at the point of consumption
  • Storage and computation should ideally be decoupled, allowing them to scale independently

Handling Metadata

Because information in the data lake is in the raw format, it can be queried and utilized for multiple different purposes, by different applications. But to make that possible, usable metadata that reflects technical and business meaning also has to be stored alongside the data. The ideal way is to have a separate metadata layer that allows for different schema to be applied on the right data sets. 

A few important elements to consider while designing a metadata layer are:

  • Make metadata creation mandatory for all data being ingested into the data lake from all sources
  • You can also automate the creation of metadata by extracting information from the source material. This is possible if you are on a cloud-based data lake
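One way to make metadata creation mandatory is to enforce it at the ingestion boundary. The sketch below rejects any data set arriving without the required fields; the field names and the in-memory lake structure are illustrative assumptions:

```python
from datetime import datetime, timezone

REQUIRED_METADATA = {"source", "owner", "schema_version"}

def ingest(lake, dataset_name, data, metadata):
    # Refuse ingestion when mandatory metadata fields are missing.
    missing = REQUIRED_METADATA - metadata.keys()
    if missing:
        raise ValueError(f"missing required metadata: {sorted(missing)}")
    stamped = {**metadata,
               "ingested_at": datetime.now(timezone.utc).isoformat()}
    lake[dataset_name] = {"data": data, "metadata": stamped}
    return lake[dataset_name]
```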

Security and Governance

The security and governance of enterprise data should be baked into the design from the start, and be aligned with the overall security and compliance practices within the enterprise. Some key pointers to ensure here:

  • Data encryption, both for data in storage and in transit. Most cloud-based solutions provide encryption by default, for core and processed data storage layers
  • Implementing network level restrictions to block big chunks of inappropriate access paths
  • Create fine-grained access controls, in tandem with the organization-wide authentication and authorization protocols
  • Create a data lake architecture that enforces basic data governance rules like the compulsory addition of metadata, or defined data completeness, accuracy, consistency requirements.
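Fine-grained access control can be sketched as group-based ACLs on lake paths. The paths and groups below are hypothetical; in practice this layer would sit on top of the organization-wide authentication and authorization system:

```python
# Hypothetical path-level ACLs: which groups may read which zone.
ACLS = {
    "raw/sales": {"data-engineers"},
    "curated/revenue": {"data-engineers", "analysts"},
}

def can_read(user_groups, path):
    # Grant access only when the user shares a group with the path's ACL;
    # unknown paths default to no access.
    allowed = ACLS.get(path, set())
    return bool(allowed & set(user_groups))
```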

With these questions answered in advance, your data lake implementation will move at a consistent pace. 

Interested in exploring how a data lake fits into your enterprise infrastructure? Talk to our expert team, and let’s find out how Srijan can help.

Topics: Data Engineering & Analytics, Architecture

What is a Data Lake - The basics

Posted by Nilanjana on May 31, 2019 3:49:00 PM

In the next 10 years, the global generation of data will grow from 16 zettabytes to 160 zettabytes, says an estimate by IDC. In addition, a forecast by Deloitte claims that unstructured data is set to grow at twice that rate, with the average financial institution accumulating nine times more unstructured than structured data by 2020. It stands to reason that data generation by enterprises in every industry will increase in a similar fashion.

All this data is crucial for businesses - for understanding trends, formulating strategies, understanding customer behaviour and preferences, catering to those requirements and building new products and services. But actually gathering, storing and working with data is never an easy task. Yes, the sheer volume of data seems intimidating, but that’s the least of our problems.

The fact that data sits fragmented in silos across the organization, or that a lot of enterprise data is never used because it is not in the right format, are currently some of the biggest challenges for enterprises working with big data.

Solution? Data lake.

What is a Data Lake?

A data lake is a part of the data management system of an enterprise, designed to serve as a centralized repository for any data, of any size, in its raw and native format. The most important element to note here is that a data lake architecture can store unstructured and unorganized data in its natural form for later use. This data is tagged with multiple relevant markers so it’s easy to search with any related query.

Data lakes operate on the ELT strategy:

  • Extract data from various sources like websites, mobile apps, social media etc.
  • Load data into the data lake, in its native format
  • Transform it later to derive meaningful insights, as and when a specific business requirement arises

Since the data is raw, it can be transformed into the format of choice and convenience. When a business question arises, the data lake can be searched for relevant data sets, which can then be analyzed to help answer it. This is possible because the schema of the stored data is not defined in the repository unless a business process requires it.

This possibility of exploration and free association of unstructured data often leads to the discovery of more interesting insights than predicted.
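The ELT flow described above can be sketched in a few lines of Python. The event shapes and the `revenue` query are invented for illustration; the point is that the lake keeps raw strings exactly as they arrived, and schema is applied only when a question is asked (schema-on-read):

```python
import json

lake = []

# Extract + Load: events land in their native format, untouched
lake.append('{"user": "a", "action": "buy", "amount": 20}')
lake.append('{"user": "b", "action": "view"}')
lake.append('not json at all')  # raw feeds can be messy; store it anyway

# Transform, later: parse only what this particular query needs
def revenue(raw_events):
    total = 0
    for raw in raw_events:
        try:
            event = json.loads(raw)
        except ValueError:
            continue  # records this query cannot parse are skipped
        if event.get("action") == "buy":
            total += event.get("amount", 0)
    return total

print(revenue(lake))  # 20
```

A different business question tomorrow could define a completely different transformation over the same stored records, which is exactly what a schema-on-write warehouse cannot offer.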

How is a Data Lake Different from a Data Warehouse?

A data lake is often mistaken for a different version of a data warehouse. Though the basic function, data storage, is the same, the two differ in how information is stored in them.

Storing information in a data warehouse requires properly defining the data, converting it into acceptable formats, and defining its use case beforehand. In other words, a warehouse follows ETL rather than ELT: the 'Transform' step comes before the 'Load' phase. With a data warehouse:

  • Data is always structured and organized before being stored

  • Sources of data collection are limited

  • Data usage may be limited to a few pre-defined operational purposes and it may not be possible to exploit it to its highest potential

What are the Advantages of a Data Lake Architecture?

Given that enterprises collect huge volumes of data in different systems across the organization, a data lake can go a long way in helping leverage it all. Some of the key reasons to build a data lake are:

  • Diverse sources: Most data repositories accept data from limited sources, and only after it has been cleaned and transformed. Data lakes, by contrast, store data from a wide range of sources like social media, IoT devices, mobile apps etc., irrespective of its structure and format. This ensures that data from any business system is available for use whenever required.
  • Ease of access to data: A data lake not only stores information coming from various sources; it also makes that information available to anyone who needs it. Any business system can query the data lake for the right data, and define how it is processed and transformed to derive specific insights.
  • Security: Although anyone can freely access the data in the lake, access to information about the source of that data can be restricted. This makes exploiting data beyond its intended use very difficult.
  • Ease of usage of data: Storing unprocessed data directly from the source gives information seekers greater freedom in how they use it. Data scientists and business systems working with the data do not need to adhere to a specific format.
  • Cost effective: Data lakes are a single-platform, cost-effective solution for storing large volumes of data coming from various sources within and outside the organization. Because a data lake can store all kinds of data, and scales easily to accommodate growing volumes, it is a one-time investment for enterprises. A cloud-based data lake also helps control cost, since you only pay for the storage you actually use.
  • Analytics: A data lake architecture, when integrated with enterprise search and analytics techniques, can help firms derive insights from the vast structured and unstructured data stored. A data lake can feed large quantities of coherent data to deep learning algorithms to power real-time advanced analytics. Access to raw data is also very useful for machine learning, predictive analysis and data profiling.

Data Lake Use Cases

With the sheer variety and volume of data being stored, data lakes can be leveraged for a variety of use cases. A few of the most impactful ones would be:

Marketing Data Lake

The increasing focus on customer experience and personalization in marketing has data at the heart of it. Customer information, whether anonymized or personal, forms the base for understanding and personalizing for the user. Coupled with data on customer activity on the website, social media, transactions etc, it allows enterprise marketing teams to know and predict what their customers need.

With a marketing data lake, enterprises can gather data from external and internal systems and drop it all in one place. The possibilities with this data can be at several levels:

  • Basic analytics can help get a comprehensive look into persona profiles and campaign performance
  • Unstructured data coming from disparate sources can be queried and leveraged to build basic and advanced personalization and recommendation engines for users
  • Moving further, a 360 degree view of individual customers can be formed with a data lake, pulling together information on customer journey, preferences, social media activity, sentiment analysis and more. Because of the sheer diversity of data, it is possible to drill down into any aspect of the customer lifecycle
  • Beyond this, enterprises can have data scientists perform exploratory analysis, look at the wide spectrum of data available, build some statistical models and check if any new patterns and insights emerge.

Cyber Security

Securing business information and assets is a crucial requirement for enterprises. This means cyber security data collection and analysis has to be proactive and always on. All such data can be constantly collected in a data lake, given its ability to store undefined data. It can then be analyzed continuously or periodically to identify anomalies and their causes, so that cyber threats are spotted and nullified in time.
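As a toy illustration of this kind of periodic analysis, the sketch below flags days whose failed-login counts sit more than two standard deviations from the historical baseline stored in the lake. The data and the two-sigma threshold are made up for illustration:

```python
from statistics import mean, stdev

def anomalies(counts, threshold=2.0):
    """Return indices of counts that deviate sharply from the mean."""
    mu, sigma = mean(counts), stdev(counts)
    if sigma == 0:
        return []  # no variation, nothing to flag
    return [i for i, c in enumerate(counts) if abs(c - mu) > threshold * sigma]

failed_logins = [12, 9, 11, 10, 13, 11, 250, 12]
print(anomalies(failed_logins))  # [6], the day with 250 failures
```

Real security analytics would of course use far richer signals than a daily count, but the pattern is the same: keep the raw history in the lake, and run detection over it on a schedule.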

Log Analytics

A lot of enterprises today rely on IoT data streaming in from various devices. A data lake can be the perfect storage solution for this continuously expanding stream of device logs. Teams can then run quick cleaning processes on it and make it available for analysis across different business functions.

So that was a quick look at what is a data lake and why enterprises should consider building one. Moving forward, we’ll dive into how exactly to set up a data lake and the different levels of maturity for enterprise data lakes.

Interested in exploring how a data lake fits into your enterprise infrastructure? Talk to our expert team, and let’s find out how Srijan can help.

Topics: Data Engineering & Analytics

Drupal Cache

Posted by Nilanjana on Oct 26, 2018 12:00:00 AM
Here's a preview of how our Drupal training is proceeding with Jacob Singh. After a day-long session hacking around on Drupal, we spent some time understanding best practices to follow for websites with large amounts of data.
