This one should catch your attention. When 37signals decided to leave the cloud, they received a lot of questions about their actual spending. Now there is a quick update on where they are and what their plans for 2023 look like. Enjoy!
Cloud data warehouses have become extremely popular in recent years. Their low cost and fully managed services make it easy for businesses to get started and scale their data analysis efforts as needed. However, the pricing models for these services can be complicated, with a lot of factors affecting cost. The choice between Snowflake and BigQuery will depend on the organization's specific needs and usage patterns. Without further ado, let’s dig deeper into Jakub Jurczak’s post and find out which solution you should choose.
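To make the "many factors affect cost" point concrete, here is a minimal back-of-the-envelope sketch of the two billing models the post compares: BigQuery's on-demand pricing charges per terabyte scanned, while Snowflake charges for credits consumed by running warehouses. All rates and workload numbers below are illustrative assumptions, not figures from the article.

```python
# Illustrative comparison only: list prices and workload numbers are assumptions,
# and real bills also depend on edition, region, caching, storage, and discounts.

TB_SCANNED_PER_MONTH = 50          # assumed monthly scan volume (on-demand model)
BIGQUERY_ON_DEMAND_PER_TB = 5.0    # assumed USD per TB scanned

SNOWFLAKE_CREDIT_PRICE = 3.0       # assumed USD per credit for the chosen edition
WAREHOUSE_CREDITS_PER_HOUR = 8     # e.g. a Large warehouse consumes ~8 credits/hour
WAREHOUSE_HOURS_PER_MONTH = 160    # assumed hours the warehouse actually runs

bigquery_compute = TB_SCANNED_PER_MONTH * BIGQUERY_ON_DEMAND_PER_TB
snowflake_compute = (
    WAREHOUSE_CREDITS_PER_HOUR * WAREHOUSE_HOURS_PER_MONTH * SNOWFLAKE_CREDIT_PRICE
)

print(f"BigQuery on-demand compute:  ~${bigquery_compute:,.0f}/month")
print(f"Snowflake warehouse compute: ~${snowflake_compute:,.0f}/month")
```

The point is not which number comes out lower, but that the cost drivers differ: scan volume on one side, warehouse size and uptime on the other.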
2022 was one of the hardest years ever to run a business. All sorts of challenges pushed engineers to look at their technical stacks from different perspectives, from how to scale the system to how to control costs and make the business more resilient. These four tips can have a huge impact on your business and free up money for more critical domains.
What changed after the Similarweb team decided to use Databricks clusters as the compute for their Batch API? A story about how they were able to cut their monthly Databricks costs from $25,000 to just $12,500 by making a few key changes to their setup. You will also find a reminder of four things you should do:
- Analysis
- Initiative
- Patience
- Clicking (but what exactly? Check it out in the blog post!)
How a modern data architecture on AWS helped BMS easily share data across organizational boundaries. Read how the AWS and Minfy Technologies teams helped a company from India choose the right technology services and complete the migration in four months. The solution overview, walkthrough, and benefits of a modern data architecture are waiting for you.
Open source Kafka had helped Michelin jumpstart its event-driven transformation, so it was time for the company’s next bold move: a leap to the cloud. With the help of Confluent Cloud, a fully managed, cloud-native Kafka service, the company embarked on a cloud transition. Read why the #BeEvergreen and #BeDataDriven criteria were important to them and what changed after the move.
A tutorial on how to easily replicate all your relational data stores into a transactional data lake in an automated fashion, with a single ETL job, using Apache Iceberg and AWS Glue. After deploying this solution, the ingestion of tables from a relational data source is fully automated.
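Before diving into the tutorial, here is a minimal sketch of the core moving part: a PySpark job writing a table into an Iceberg-backed data lake registered in the AWS Glue Data Catalog. The catalog name, S3 warehouse path, database, and table are placeholders, and the tutorial's actual single-job replication logic (source discovery, CDC merges) is not reproduced here.

```python
# Minimal sketch: write one DataFrame as an Iceberg table registered in the AWS Glue
# Data Catalog. Catalog name, warehouse path, and table names are placeholders.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("iceberg-glue-sketch")
    # Register an Iceberg catalog named "glue_catalog" backed by the Glue Data Catalog.
    .config("spark.sql.catalog.glue_catalog", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.glue_catalog.catalog-impl", "org.apache.iceberg.aws.glue.GlueCatalog")
    .config("spark.sql.catalog.glue_catalog.io-impl", "org.apache.iceberg.aws.s3.S3FileIO")
    .config("spark.sql.catalog.glue_catalog.warehouse", "s3://my-data-lake/warehouse/")  # placeholder
    .config("spark.sql.extensions", "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .getOrCreate()
)

# In the real job this DataFrame would come from the source database (e.g. via JDBC).
source_df = spark.createDataFrame([(1, "alice"), (2, "bob")], ["customer_id", "name"])

# Create or replace the Iceberg table in the lake; later runs could MERGE changes instead.
source_df.writeTo("glue_catalog.sales_db.customers").using("iceberg").createOrReplace()
```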
This one answers the question of how customers can leverage the support for Unity Catalog objects in the Databricks Terraform provider to effectively manage a distributed governance pattern on the lakehouse.
You will find two solutions here:
- One that completely delegates responsibilities to teams when it comes to creating assets in the Unity Catalog
- One that limits which resources teams can create in the Unity Catalog
BigQuery is breaking out of its SQL-only interface and providing new developer extensions for workloads that require programming beyond SQL. These flexible programming extensions are all offered without the limitations of running virtual servers. What do they bring?
- BigQuery Stored Procedures for Apache Spark
- Google Colab Integration with BigQuery Console
- Remote Functions now GA (a minimal sketch follows below)
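As a taste of the Remote Functions item, here is a minimal sketch of the Python side: an HTTP Cloud Function that BigQuery calls with a JSON batch of rows ("calls") and that must return one value per row ("replies"). The function name and masking logic are made-up examples; you would still register the deployed endpoint in BigQuery with a CREATE FUNCTION ... REMOTE WITH CONNECTION statement.

```python
# Minimal sketch of a Cloud Function callable as a BigQuery remote function.
# The "calls"/"replies" shape follows the Remote Functions contract; the name
# and masking logic are illustrative assumptions.
import json

import functions_framework


@functions_framework.http
def mask_email(request):
    """Receive a batch of rows from BigQuery and return one reply per call."""
    payload = request.get_json(silent=True) or {}
    replies = []
    for call in payload.get("calls", []):
        email = call[0] or ""
        local, _, domain = email.partition("@")
        # Keep the first character of the local part, mask the rest.
        replies.append(f"{local[:1]}***@{domain}" if domain else "***")
    return json.dumps({"replies": replies})
```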
We are pleased to announce the General Availability (GA) of support for orchestrating dbt projects in Databricks Workflows. Since the start of the Public Preview, we have seen hundreds of customers leverage this integration with dbt to collaboratively transform, test, and document data in Databricks SQL warehouses.
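For a feel of what orchestrating dbt in Workflows looks like programmatically, here is a hedged sketch that creates a job with a dbt task through the Jobs 2.1 REST API. The host, token, repository, warehouse ID, and cluster ID are placeholders, and the exact payload fields should be double-checked against the current API documentation.

```python
# Hedged sketch: create a Databricks Workflows job with a dbt task via the Jobs 2.1 API.
# Host, token, repo URL, warehouse_id, and cluster_id are placeholders; verify the
# payload against the docs before relying on it.
import requests

DATABRICKS_HOST = "https://example-workspace.cloud.databricks.com"  # placeholder
TOKEN = "dapi-REDACTED"                                             # placeholder PAT

job_spec = {
    "name": "nightly-dbt-run",
    # Workflows pulls the dbt project from Git before running the commands.
    "git_source": {
        "git_url": "https://github.com/example-org/dbt-project",  # placeholder repo
        "git_provider": "gitHub",
        "git_branch": "main",
    },
    "tasks": [
        {
            "task_key": "dbt_build",
            "dbt_task": {
                "commands": ["dbt deps", "dbt build"],
                "warehouse_id": "1234567890abcdef",  # placeholder SQL warehouse
            },
            "libraries": [{"pypi": {"package": "dbt-databricks"}}],
            "existing_cluster_id": "0101-123456-abcdefgh",  # placeholder cluster
        }
    ],
}

resp = requests.post(
    f"{DATABRICKS_HOST}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=job_spec,
)
resp.raise_for_status()
print("Created job:", resp.json().get("job_id"))
```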
In this episode you will hear about the global network performance of some of the biggest public cloud providers. The sponsor, ThousandEyes, a Cisco company, has a worldwide network of sensors that measure performance to, from, and across AWS, Azure, and Google Cloud.
Subjects that were discussed:
- Highlights of the 2022 report
- Why small outages can be as impactful as bigger ones
- Cloud performance is not a “steady state”
- Why cloud-to-cloud performance looks pretty good
- The role of networking in cloud design and application performance
…and more.
Hear firsthand from technology leaders at companies such as Apple, Uber, Bloomberg, AWS, and Microsoft about their experiences architecting and building modern cloud data lakes. Learn how to innovate with open source technologies such as Apache Arrow, Apache Iceberg, Nessie, Delta Lake, Airflow, Dagster, Apache Superset, Apache Druid, Apache Ranger and more.
Take your AI/ML skills to the next level today! Get hands-on, step-by-step architectural and deployment best practices to help you build better, innovate faster, and deploy at scale. Whether you are just getting started with AI/ML, an advanced user, or simply curious about AI/ML, we have a specific track for your level of experience and job role.
Discover the power of the Databricks Lakehouse at our series of live Lakehouse Days across EMEA. Join and find out how the lakehouse architecture unifies your data, analytics and AI, combining the best of data warehouses and data lakes on one simple platform. Built on an open and reliable data foundation that efficiently handles all data types, the lakehouse applies one common security and governance approach across all your data and cloud platforms.