How do I get connections for data access?

You need access to data before you can work on it. Each workspace has one or more underlying Data Schemas, and you can create a workspace that lets you select several of them. You add or remove Data Schemas as you would in a multi-select combo box, where, for example, Schemas 1, 2, 3, 4, and 5 are the underlying schemas. When you work with models, you can use the Notebook to fetch data from all of these Data Schemas and create data frames from it. Those data frames can then be used for building models or for other purposes.

This happens in the sandbox workspace, where you build the Notebook. The same Notebook is later promoted to the production workspace, which has its own set of underlying Data Schemas. If you build the model by getting connections to the underlying Schemas 1 and 2, fetching their data, and building on it, the Notebook must keep working and must not be affected when it is promoted to production or when the deployment is cloned.

Therefore, when the Notebook runs after promotion, it should not fetch the sandbox data. It should work with Schemas 1 and 2 of whichever workspace it is running in.
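The problem described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionaries stand in for the Data Schemas attached to each workspace, and every name (SANDBOX_SCHEMAS, PRODUCTION_SCHEMAS, fetch_hard_coded, runs_in) is hypothetical and not part of the product. A Notebook that names a sandbox schema directly may, depending on the platform, either fail in production or keep reading sandbox data; the sketch models the failing case.

```python
# Hypothetical stand-ins for the schemas attached to each workspace.
# Schema names and rows are illustrative only.
SANDBOX_SCHEMAS = {
    "sandbox_schema_1": ["sandbox row"],
    "sandbox_schema_2": ["sandbox row"],
}
PRODUCTION_SCHEMAS = {
    "prod_schema_1": ["production row"],
    "prod_schema_2": ["production row"],
}


def fetch_hard_coded(available_schemas):
    """Notebook code that names a sandbox schema directly."""
    # Raises KeyError in any workspace where this exact schema is not attached.
    return available_schemas["sandbox_schema_1"]


def runs_in(available_schemas):
    """Return True if the hard-coded Notebook code works in this workspace."""
    try:
        fetch_hard_coded(available_schemas)
        return True
    except KeyError:
        return False


print(runs_in(SANDBOX_SCHEMAS))     # works where it was built
print(runs_in(PRODUCTION_SCHEMAS))  # breaks after promotion
```

The Notebook works only in the workspace where it was written, which is the situation the connection feature is meant to avoid.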

To avoid this issue, use the connection feature to connect to a schema. It is a wrapper function in which you can specify which workspace you are connecting to.

Enter the workspace details to get the connection, and the Notebook starts fetching the data.

When you promote the Notebook to production, the script runs so that it does not connect to the sandbox workspace. The get connection function also has overloaded methods, which determine how the connection is obtained. A simple get connection call, without any overload, returns the primary connection, that is, the first Data Schema you are using.

The second overload takes an ID, which identifies the Data Source you are using. For the current example, you pass 1, as in get connection 1.
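The two forms of the call can be sketched as follows. This is a minimal illustration, not the product's API: the Workspace class, its get_connection method, the schema_id parameter, and the returned string are all hypothetical, and the ID is assumed to be a 1-based position. Python has no true method overloading, so an optional argument stands in for the two overloads.

```python
from typing import Optional


class Workspace:
    """Hypothetical stand-in for a workspace and its ordered Data Schemas."""

    def __init__(self, name, schemas):
        self.name = name
        self.schemas = list(schemas)  # selection order, primary first

    def get_connection(self, schema_id: Optional[int] = None) -> str:
        # No argument: fall back to the primary (first) Data Schema.
        position = 1 if schema_id is None else schema_id
        if not 1 <= position <= len(self.schemas):
            raise ValueError(f"no Data Schema at position {position}")
        # A real connection object would be returned here; a label is enough
        # to show which schema was resolved.
        return f"{self.name}:{self.schemas[position - 1]}"


sandbox = Workspace("sandbox", ["schema_a", "schema_b"])
print(sandbox.get_connection())   # primary schema, no argument
print(sandbox.get_connection(1))  # same schema, requested by ID
print(sandbox.get_connection(2))  # second schema
```

Under these assumptions, calling get_connection() and get_connection(1) resolve to the same schema, which matches the description of the plain call returning the primary connection.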

In the sandbox, the script looks for 1 and creates a connection. After the Notebook moves to production, the script again looks for the equivalent 1 there and tries to get a connection.

The order in which you select the Data Schemas determines their positions. Whatever you select first becomes the first Data Schema, followed by the second, the third, and so on, that is, Primary, Secondary, Tertiary. You can also pass a number when getting the connection to fetch a schema other than the primary one, for example the Secondary Data Schema. When the Notebook runs in the sandbox, it gets the Secondary Schema of the sandbox. When it runs in production, it fetches the Secondary Data Schema of production.
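The positional behaviour can be sketched as follows, again under stated assumptions. The names make_get_connection, notebook_body, and the schema names are hypothetical and chosen only for illustration; the real feature resolves the workspace for you. The point is that the Notebook code asks for a position and never names a schema, so the same code resolves to different schemas in each workspace.

```python
def make_get_connection(selected_schemas):
    """Build a hypothetical get_connection bound to one workspace.

    selected_schemas is the list of Data Schemas in the order they were
    selected: index 0 is Primary, index 1 is Secondary, and so on.
    """
    ordered = list(selected_schemas)

    def get_connection(position=1):
        if not 1 <= position <= len(ordered):
            raise ValueError(f"no Data Schema at position {position}")
        return ordered[position - 1]

    return get_connection


def notebook_body(get_connection):
    """Notebook code: asks for the Secondary schema, never names it."""
    return get_connection(2)


# Each workspace has its own schemas, selected in its own order.
sandbox = make_get_connection(["sbx_sales", "sbx_customers", "sbx_returns"])
production = make_get_connection(["prd_sales", "prd_customers", "prd_returns"])

print(notebook_body(sandbox))     # Secondary schema of the sandbox
print(notebook_body(production))  # Secondary schema of production
```

Because notebook_body is identical in both runs, promoting the Notebook requires no change to its code; only the workspace it runs in differs.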