Good data quality starts with metadata—and the importance of quality data cannot be overstated. Poor data quality costs the typical company between 10% and 20% of their revenue. In addition, high quality data is crucial for complying with regulations.
Posted September 09, 2020
The database world is in tumult these days. There are new requirements and new capabilities that organizations are adopting and integrating into their data persistence infrastructure all the time. The world is no longer relational/SQL-only. Organizations are adopting NoSQL database systems to support specific use cases and types of workloads. This is increasing the complexity of how data is managed. But it is not just NoSQL that is driving organizations to run multiple DBMSs. Many organizations have more than one relational DBMS. They may run Db2 on the mainframe and Linux, Oracle on UNIX, and SQL Server on Windows, and perhaps have a few MySQL instances, too. And DBAs are managing a lot of different database instances.
Posted August 11, 2020
Both development and production database administration are required to support database applications. However, it is not usually necessary to have different DBA staffs to perform the different roles. Indeed, intimate knowledge of how a database application was developed can make it easier to support that application once it becomes operational in the production world. But the bottom line is this: You will need to define, plan for, and staff both development and production DBA roles in order to create useful database applications.
Posted July 01, 2020
When non-DBAs think about what it is that a DBA does, performance monitoring and tuning are usually the first tasks that come to mind. This should not be surprising. Almost anyone who has come in contact with a computer has experienced some type of performance problem. Moreover, relational database systems have a notorious reputation (much of it undeserved) for poor performance.
Posted June 10, 2020
As a database consultant, one of the things I am most frequently asked about is dealing with database locking issues. Every DBMS has different settings and parameters to control locking at the system and database levels. And, yes, it is important to set these up correctly, but they are rarely the core cause of performance problems caused by locking. So let's focus on the universal issue that is almost always the culprit: poor coding techniques and inappropriate oversight before code is moved to production.
Posted May 13, 2020
Modern software development teams have adopted a continuous delivery approach based upon DevOps and agile development techniques. The small and frequent code changes that result from such methodologies can deliver significant benefit in terms of reduced lead time for changes, a lower failure rate, and a reduced mean time to recovery when errors are encountered. Indeed, today's developers migrate more frequent changes into the production environment than ever before.
Posted April 08, 2020
Ensuring the integrity of the organization's databases is a key component of the DBA's job. A database is of little use if the data it contains is inaccurate or if it cannot be accessed due to integrity problems. The DBA has many tools at his disposal to ensure data integrity. At a high-level, there are two aspects to data integrity.
Posted March 05, 2020
The mainframe has a long history of applications and uses, which stretches back more than 50 years. This means that some mainframe applications require modernization. This is often referred to as "mainframe modernization," but a better term for this is "mainframe application modernization" because the mainframe already is a modern platform.
Posted February 10, 2020
Application developers typically do not spend a lot of time on optimizing performance. Their focus is on writing code that matches the project specifications and delivers the expected results. So, it makes sense that SQL performance testing should be conducted on all programs before they are migrated to a production environment, or you will likely experience performance problems
Posted January 02, 2020
If you have been around the IT industry for as long as I have, you have seen technologies and ideas come and go—and sometimes even come back again. This is surely the case with the "new" products that call themselves data catalogs.
Posted December 01, 2019
Truly, the speed and performance of your production database systems encompasses a wide range of parameters and decisions that are made well before implementation. DBAs need to understand the options available, the factors that impact performance and development with each DBMS option, and work to keep the IT organization up-to-speed and educated on all of the available choices.
Posted October 31, 2019
A "best practice" is a professional procedure or tactic that is accepted or prescribed as being correct or most effective. Take a moment to think about the things that you do, or want to do, on a daily basis to manage your database infrastructure. What are the main categories of "things" that you want to tackle with best practices?
Posted October 01, 2019
Although adopting advanced analytics is on the radar for most organizations these days, it is important to understand some of the problems that can occur as you implement analytics projects. Perhaps the most important obstacle to overcome is ensuring buy-in from your organization's leaders.
Posted September 03, 2019
Unless you've been living in a cave somewhere, you've certainly heard news about stolen and hacked data, commonly referred to as data breaches. They occur with great regularity, and there is no indication that the frequency of data breaches is slowing down. Organizations must take more aggressive actions to prevent the sensitive information in their care from being stolen and abused.
Posted August 07, 2019
The latest release of Db2, version 11.5, unleashed on the world in June 2019, is being marketed by IBM as "The AI Database." AI promises to be one of the biggest technological game-changers of this century as it enables computing devices and systems to take on more activities that heretofore had to rely on a human. But, what does IBM mean by an AI database?
Posted July 18, 2019
What are the practices and procedures that you have found to be most helpful to automate in administering your databases? Yes, I know that automation has been a standard claim for most DBMS vendors, as well as third-party DBA tool vendors, for many years. But are you really anywhere closer to an "on demand," "lights-out," "24/7" database environment yet?
Posted June 10, 2019
It can be challenging for IT architects and executives to keep up with today's modern IT infrastructure. Homogeneous systems, common in the early days of computing, are almost non-existent today in the age of heterogeneous systems. It is de rigueur for Linux, Unix and Windows servers to be deployed throughout a modern IT infrastructure. And for larger shops, add in mainframes, too.
Posted May 01, 2019
One of the most important and rapidly changing areas of database management is security and protection. The major DBMS vendors have been adding security features and improving the way you can protect your precious corporate data. But it can be difficult to keep up with these features, so let's take a brief look at some of the more interesting and useful database security options.
Posted April 09, 2019
Scalability and elasticity are related, though they are different aspects of database availability. Both scalability and elasticity help to improve availability and performance when demand is changing, especially when changes are unpredictable.
Posted March 04, 2019
Every organization that manages data using a DBMS requires a database administration group to ensure the effective use and deployment of the company's databases. And since most modern organizations rely on database systems, they also need DBAs. That said, the discipline of database administration is not well-understood, nor is it universally practiced in a coherent and easily replicated manner.
Posted February 08, 2019
Everybody knows that a database management system, or DBMS, is the system software used to store, manage, and access enterprise data. But what is a data analytics platform? Well, it can be a lot of things, so make sure that you examine any "platform" your organization is considering very carefully. A data platform might comprise a DBMS, and it might not. It might be a tool to help you collect and analyze large data sets from various sources. It might be a series of analytics tools and interfaces for accessing data. Or it might be any one, or set of, offerings that manage, virtualize, index, secure, or otherwise work with your data.
Posted January 02, 2019
Managing the performance of database systems and applications is a significant job responsibility for DBAs. From a database perspective, there are three basic performance components that must be performed.
Posted December 04, 2018
Before the end of the decade, the number of connected objects is projected to expand greatly. According to several different analysts, the number of connected objects by 2020 could be as low as 26 billion or as high as 50 billion. But even the low end of that range is quite large. Indeed, connectedness is becoming commonplace and accepted across a wide spectrum of services and applications.
Posted November 01, 2018
When you mention big data and analytics, the first thing most people think of is Hadoop, Spark, and NoSQL. But are these newer technologies required for big data projects? What about the mainframe? Mainframes are not often mentioned in big data articles and advertising. But they should be.
Posted October 10, 2018
The current trend for software development teams is to adopt a continuous delivery approach based upon DevOps and agile development techniques. DevOps is relatively a new term, coined in 2009, defining software engineering practices that combine software development (Dev) and software operations (Ops). The general idea is not really all that new, but the adoption of agile techniques and modern tooling to automate software delivery is. The goal of DevOps is for developers and operations personnel to collaborate throughout the entire service lifecycle, from design through development and into production.
Posted September 04, 2018
We are living in the age of polyglot persistence, which really just means that it makes sense to store data using the technology that best matches the way the data will be used by applications. The age of trying to force everything into a relational DBMS is over, and we now have NoSQL, NewSQL, in-memory, and Hadoop-based offerings that are being used to store data. But you really should also be looking at the algorithmic approach offered by Ancelus Database.
Posted August 08, 2018
A Worldwide Data Ethics Council is something we greatly need. The council would focus on debating, crafting and proposing clearer regulations that dictate what is—and is not—ethical in terms of data collection, retention, and usage. Furthermore, it would communicate the message of being skeptical of everything and using caution before sharing anything with anyone. The council could also work on forms of data ethics education for schools and universities, as well as to educate the press and government officials. I mean, let's face it, after watching those U.S. Congressional hearings with Mark Zuckerberg I don't think any techies believe that government officials are prepared for the Information Age.
Posted July 02, 2018
With all of the data breaches and accusations of improper data usage in the news these days, the question of who owns data looms large. Understanding who owns which data is a complex question that can't be answered quickly or easily.
Posted June 01, 2018
The relational optimizer is a very complex component of the RDBMS that we too often take for granted. The optimization techniques of the major RDBMS products continue to be improved with every new release, too. Relational optimization has saved countless hours of work and, as long as we use it properly and code our applications with knowledge of what optimization can do, the RDBMS can be used for a wide variety of requirements and use cases. Don't lose sight of that as you wend your way through the hype out there regarding new types of database systems.
Posted May 08, 2018
Data lake is a newer IT term created for a new category of data store. But just what is a data lake? According to IBM, "a data lake is a storage repository that holds an enormous amount of raw or refined data in native format until it is accessed." That makes sense. I think the most important aspect of this definition is that data is stored in its "native format." The data is not manipulated or transformed in any meaningful way; it is simply stored and cataloged for future use.
Posted April 12, 2018
DBAs spend a great deal of time monitoring and managing the performance of database systems and applications. But it doesn't have to be such a large percentage of their time. Yes, changing data patterns, requirements, and time will always conspire together to create performance problems. But more can be done to avoid the up-front poor performance of most database applications.
Posted March 07, 2018
The term "big data" has been bandied about for a number of years now, to the point where it has been used so much that it is a part of IT culture. Hard to specifically define, yet everyone seems to have a good idea what is meant by it, big data is here to stay. And that is a good thing!
Posted February 01, 2018
Blockchain is a distributed, shared, permissioned ledger for recording transactions with consensus, provenance, immutability and finality. It is the technology that drives virtual currencies like Bitcoin. But its potential spans many more industries and use cases than just virtual currencies.
Posted January 02, 2018
The importance of data to today's modern world becomes more and more clear every day. Organizations are creating, storing, gathering, and managing more data than ever before. If you are reading this article, chances are, you will agree with this statement: "You are managing more data this year than you did last year … and your organization is planning to manage even more data next year."
Posted December 01, 2017
One of the prevailing conversations around database systems lately revolves around ACID support. But not everybody knows what is meant by the term ACID. Oh, perhaps they know that it involves how data integrity is maintained or that it impacts locking. And, at a high level many folks know that relational/SQL systems support ACID whereas that is not always the case for NoSQL database systems.
Posted November 01, 2017
What Type of DBA Are You?
Posted October 18, 2017
DBAs spend a lot of time on tuning application code and SQL statements to boost efficiency and optimize access. But SQL is only one aspect of database systems performance. It is also important for DBAs to devote time to tuning and optimizing the design, parameters, and physical construction of database objects, specifically tables and indexes, and the files in which their data is stored.
Posted September 07, 2017
Referential integrity (RI) is a method for ensuring the "correctness" of data within a DBMS. People tend to oversimplify RI, stating that it is merely the identification of relationships between relational tables. It is actually much more than this. RI embodies the integrity and usability of a relationship by establishing rules that govern that relationship.
Posted August 09, 2017
As every DBA should know, DBMS data is typically persisted using disk storage. So the data is stored on disk and when it is later read or modified it has to be accessed and changed on disk. To optimize these processes, most DBMSes use a cache, or buffer pool, to stage data in memory when it is accessed. By moving the data to memory subsequent accesses to the same data can be more efficient because disk I/O can be bypassed.
Posted July 05, 2017
When a failure occurs, the DBA must ascertain whether recovery is required. It is possible, though not very likely for active databases, that a failure does not impact the integrity of your data. Assuming that recovery is required, the DBA will need to determine what resources (backup copies and log files) are available and how best to perform the needed database recovery.
Posted June 01, 2017
As DBAs, we can get mired in the depths of performance tuning parameters and scripts, sometimes getting lost in all the details. It is a good idea to always have a set of goals and philosophies that you can lean on to keep you focused and working on the appropriate things. That is what I want to talk about in this month's DBA Corner column. Some high-level rules of thumb for achieving your DBMS-related performance goals and maintaining your sanity.
Posted May 05, 2017
Every good DBA understands that backing up their database data is a non-optional part of assuring data availability and integrity. As a DBA, you need to know the difference between a full image copy backup and an incremental image copy backup and implement the proper image copy backup strategy based on application needs and database activity.
Posted April 07, 2017
The heart of any relational database management system is the system catalog that documents the database objects and system settings being used. The system catalog offers a wealth of information about your DBMS. You can think of it as the knowledge base of every piece of data known to the system. For this reason, it is important that DBAs understand what is in the system catalog, as well as how to access and manipulate the information it contains.
Posted March 02, 2017
The role of the DBA is growing and becoming more complicated in the age of digital transformation. As the amount and type of information we store expands, DBAs must become versed in administering not just one type of DBMS (e.g., relational), but multiple types (document, key/value, wide column stores, and graph) and even non-DBMS data platforms (e.g., Hadoop). Furthermore, cloud computing can change the manner in which existing applications and databases operate.
Posted February 08, 2017
The industry is changing, the way that DBAs work is changing, and database systems are changing. We all need to come to grips with the fact that the way we worked in the past is no longer the way we work with today's modern database environment.
Posted January 03, 2017
The clear trend these days is to automate and enable computerized tasks to streamline and optimize administrative and maintenance tasks. Many database management tasks that today require oversight and handholding by DBAs can, over time, be turned over to intelligently automated software to manage. But automation is just the first step.
Posted December 01, 2016
Navicat: A Helpful Toolkit for DBAs
Posted November 02, 2016
Typically, most applications consist of both batch and online workloads. This is true even today, when most of our attention has turned to online and web-based interaction. Sure, online activities are the most obvious, in-your-face component of countless applications, but batch processing still drives many actions behind the scenes. This can include applying updates, processing reports, integrating input from multiple sources and locations, data extraction, database utility processing, and more.
Posted October 07, 2016
The current driving force for many IT projects is big data and analytics. So how will the job of DBA be impacted as their companies deploy big data analytics systems? The answer, is quite a bit, but don't forget everything you already know!
Posted September 02, 2016
Regulatory compliance is a critical aspect of the IT landscape these days, and the ability to audit database activities showing who did what to which data when is a specific requirement of many industry and governmental regulations. There are six primary methods that can be used to accomplish database auditing.
Posted August 04, 2016