Does deploying a physical schema onto NoSQL make sense

Michelle Knight

Eric, in a previous thread about big data, mentioned an additional question. Does deploying a physical schema onto NoSQL make sense? Any thoughts here?

William Frank

RE: Does deploying a physical schema onto NoSQL make sense
(in response to Michelle Knight)

This depends on what you mean by "deploying" and what NoSQL database you are using.  

If you have no physical *datamodel*, how are programmers supposed to use the database, and how are they to know what to expect in it?

OTOH, there is no need to "deploy" the database FIRST, before the code that matches that database can execute.  In SQL databases, there must be a prior specification of the physical structure in a data definition language (DDL), which must be "deployed" first; only then can a new version of the code that goes against that version of the physical database be released.  This model does not fit most NoSQL databases.  Instead, the database structure is usually defined in the code itself, such as the creation of collections in Mongo.  Further, when data element needs change, the code can insert documents into a collection that differ in whatever ways are necessary from the other documents in the collection.  Additional code might then go back and revise the structure of some or all of the documents already in the collection.  So, there is no *separate* deployment of the database and the code, and for object-oriented languages, no impedance mismatch between the database and the code in a document-oriented database.
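The last few sentences can be sketched roughly as follows. This is only an illustration, not tied to any particular driver: a plain Python list of dicts stands in for a document collection, and the `customers` list, the field names, and the `backfill_name_parts` helper are all hypothetical. A real Mongo application would do the equivalent through its driver.

```python
# Hypothetical in-memory stand-in for a document collection: a list of dicts.
customers = []

# An early version of the code writes documents with a single "name" field.
customers.append({"_id": 1, "name": "Ada Lovelace"})

# A later version needs separate name parts, so it simply starts writing
# documents with a different shape into the same collection.
customers.append({"_id": 2, "first_name": "Alan", "last_name": "Turing"})


def backfill_name_parts(docs):
    """Revise older documents in place so they match the newer shape.

    Returns the number of documents that were changed.
    """
    changed = 0
    for doc in docs:
        if "name" in doc and "first_name" not in doc:
            # Split on the first space only; good enough for a sketch.
            first, _, last = doc.pop("name").partition(" ")
            doc["first_name"] = first
            doc["last_name"] = last
            changed += 1
    return changed


changed = backfill_name_parts(customers)
```

Nothing outside the application code had to be deployed for either the new shape or the backfill; running the backfill a second time changes nothing.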

This is the sensible meaning of "schemaless": that the schema can change.  As the very best practice, we can use any NoSQL database, say key-value, document, or graph, to model a domain and have a domain-driven design, just a design that is not cast in stone.  Next down the practice chain, NoSQL physical models might be organized just to optimize certain queries.  But anyone who attempts to have no design for the structure of documents and their relationships to collections, and just codes things as they occur, will need to be working only with a toy database.

Michelle Knight

RE: Does deploying a physical schema onto NoSQL make sense
(in response to William Frank)

Thank you. I hear you say that there needs to be a model of what is outputted by the NoSQL database. The schema that is inputted is the entire program, or database itself, because the schema can change upon reading the data. Practically, you can model the domain in the NoSQL database, but the design can change. I will think about this a bit more. I appreciate it.

William Frank

RE: Does deploying a physical schema onto NoSQL make sense
(in response to Michelle Knight)

Exactly, Michelle.  In addition, several things can change at run time.  First, the schema for a given type of data 'record', such as a document, may change, analogous to an individual relational row changing its columns, thus creating alternative schemas for a given type of information.  Second, in some NoSQL DBMSs, the application software itself can go back and modify the schema of previously added records.  Third, the *groupings* of records into types of things, such as Mongo "Collections", analogous to tables, can be changed and new ones added by the program, and even databases themselves may be created at run time by a program.

To my mind, this makes the responsibility to manage the data model even more necessary, for chaos can result, since one does not literally 'need' to have an explicit datamodel.  It also makes the job more demanding, since a "DBA" can't separately control the data model.
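One way a team might keep such a data model explicit in code is to audit the shapes of the documents that actually exist. The sketch below is an assumption-laden illustration, not a feature of any particular NoSQL product: the `orders` documents, the `KNOWN_SHAPES` set, and the `audit_shapes` helper are all hypothetical, and a list of dicts stands in for a collection.

```python
from collections import Counter

# Hypothetical documents from one collection, written by different code versions.
orders = [
    {"_id": 1, "customer_id": 7, "total": 20.0},
    {"_id": 2, "customer_id": 8, "total": 5.5},
    {"_id": 3, "customer": {"id": 9}, "amount": 12.0},
]

# The shapes the team has agreed on (an assumption for this sketch): each
# entry is the set of top-level field names one document version may have.
KNOWN_SHAPES = {
    frozenset({"_id", "customer_id", "total"}),
}


def shape_of(doc):
    """Return the set of top-level field names of a document."""
    return frozenset(doc)


def audit_shapes(docs, known_shapes):
    """Count documents per shape and list the ids that match no agreed shape."""
    counts = Counter(shape_of(d) for d in docs)
    unknown_ids = [d["_id"] for d in docs if shape_of(d) not in known_shapes]
    return counts, unknown_ids


counts, unknown_ids = audit_shapes(orders, KNOWN_SHAPES)
```

Here `unknown_ids` flags the third document, whose fields match no agreed shape. The check only looks at top-level field names, so it is a starting point for a conversation about the model rather than a full validation.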

The good news, to me: "data architects" and programmers can no longer live in silos that are often at odds, and need not even be different people.

Michelle Knight

RE: Does deploying a physical schema onto NoSQL make sense
(in response to William Frank)

Hi William,

Yup, I agree with you, William. As I am thinking about the physical schema on NoSQL and in general, I realize that these are the requirements for a project, plan or whatever else. I mean that data architects model the entire data product. For NoSQL this mainly means what is outputted.  I see the data modeler or data architect working with the business folks and programmers to get a visual of what they want from the NoSQL systems. The resulting model, when approved, would become the requirements. The programmers would design to that model. Anyone doing data quality assurance would test to that data model and identify areas where the business needs clarification or programmers need to fix the software. Do you see this?

Michelle Knight

Freelance Production Assistant

Freelance Data, Technology and Science Writer

William Frank

RE: Does deploying a physical schema onto NoSQL make sense
(in response to Michelle Knight)

Yes, to the awareness that a data architecture is just as important with a NoSQL database as with SQL.  And yes, the entire project's views of data, from input formats to storage to delivered data, should be modeled.

Yet there is a big difference with NoSQL.  You don't have to 'get it right' before you can deliver a system; the software itself can restructure the data as needed *during run time*.  And this is much more realistic, in terms of how things change.  So, while the data views are the core of 'requirements', it is not as if we need to do all the requirements first, and then the programmers program to them.  The resulting minimum viable product is then adjusted as the business discovers its needs, rather than trying to get business people to sit still and actually understand a data model.
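A minimal sketch of restructuring during run time, under stated assumptions: the code upgrades an out-of-date document the first time it is read and writes the new shape back. The `store` dict stands in for a key-value or document store, and the `schema_version` field, the product fields, and both helpers are hypothetical conventions invented for this example, not something a NoSQL database requires.

```python
# Hypothetical stored documents, keyed by id.
store = {
    "p1": {"schema_version": 1, "price": "19.99"},    # v1: price as a string
    "p2": {"schema_version": 2, "price_cents": 500},  # v2: integer cents
}

CURRENT_VERSION = 2


def upgrade(doc):
    """Return a copy of doc converted to the current shape."""
    doc = dict(doc)
    if doc.get("schema_version", 1) == 1:
        # v1 -> v2: replace the string price with integer cents.
        doc["price_cents"] = round(float(doc.pop("price")) * 100)
        doc["schema_version"] = 2
    return doc


def read_product(key):
    """Read a document, upgrading and writing it back if it is out of date."""
    doc = store[key]
    if doc.get("schema_version", 1) < CURRENT_VERSION:
        doc = upgrade(doc)
        store[key] = doc  # persist the restructured document
    return doc


product = read_product("p1")
```

With this approach, old and new shapes coexist in the store until each old document happens to be read, so no separate migration step has to run before the new code ships.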

The entire programming team can participate in constructing and modifying the data models.  So, the division between data architects and programmers blurs; there is no "once approved, and THEN the programmers start."  The whole idea of 1. doing a data model, 2. getting it approved by the business, and then 3. getting it programmed, as separate delivery phases, blurs.  Why insist on approval if it is easy to change?  Don't business people relate more to the actual output than they do to 'models'?

This is the big change that the data management community has to adjust to if it is to remain relevant.  Yet it makes conceptual data modeling for the enterprise (that is, ontology) even *more* important, because the conceptual model is not tied to *ANY* project that directly delivers a needed result to the business.  Instead, the ontology describes all the concepts of the business consistently, so that the data models for individual projects can be built more quickly, with higher quality and fewer limitations on extension over time, and so that the resulting data is easy to meld into an enterprise-wide data fabric.

How to make this happen is the big question.    For, as others point out, long term results and savings and long term systems quality are often a hard sell to the business.

Michelle Knight

RE: Does deploying a physical schema onto NoSQL make sense
(in response to William Frank)

Ok, William, I can see that you can restructure data and that rigidly focusing on approval runs into the same problems people had with the waterfall method of development. Agile Data Modeling promises to be more fluid to allow for changes.

At the same time, I think it is helpful to have a rough definition or standard of what to work towards, even if it is just a small piece or not very detailed. I think that this provides a place for business and technology to have an objective discussion, even if that means delivering something hourly as the business discovers its needs.

Maybe working from a conceptual business ontology will make that easier. The tricky thing is that business ontologies may change depending on the situation, or there may not be as much agreement on the details. Ontologies help in viewing how the whole business utilizes NoSQL, but can be rough around the edges depending on what is trying to be achieved. I will have to think about that a bit.

Thanks William.

William Frank

RE: Does deploying a physical schema onto NoSQL make sense
(in response to Michelle Knight)

Michelle, yes I agree with your views here.  The fact that we can do agile, detailed, project-oriented models makes it even more necessary to have an enterprise-level ontology, standards, and some high-level agreements to tie those together.

I think Frank T van Amsterdam says this most eloquently in his post, made at the same time, about breaking up your model into smaller chunks.  So, I would say we are all of a common mind.

I have one little quibble.  You said the ontology can change.  Certainly, the ontology *model* will change, which is why it needs to be done iteratively.  But the underlying subject matter language that the ontology model is intended to describe changes its shape very slowly.  Usually, it is extended at the edges with new terms.

Michelle Knight

RE: Does deploying a physical schema onto NoSQL make sense
(in response to William Frank)

Good point. Thank you.
