Contents
| Bastion VM | The Virtual Machine which has access to Kubernetes API server |
| Base Infra | The base Infrastructure which needs to be created before deploying product Infrastructure |
| Product Infra | The Kubernetes infrastructure required for respective product |
| SmartInstall | The Holistic Solution for deploying SmartOps applications in Kubernetes infrastructure |
Base infra and ITOps-specific infrastructure are available in the environment. Refer to the infrastructure creation document to understand how to create the infrastructure.
The installation engineer has access to the Bastion VM (a VM with visibility to the ITOps Kubernetes cluster).
The Bastion VM can connect to the K8s API server of ITOps (kubectl commands work).
The installation engineer has access to the ITOps Key Vault, and the Bastion VM can connect to the Key Vault (configured via the Key Vault network firewall).
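The prerequisites above can be partly checked from the Bastion VM before starting. Below is a minimal sketch, assuming the CLI tools used later in this document (az, kubectl, helm, python3) are the ones required; the function name missing_tools is hypothetical and the tool list should be adjusted to the environment.

```shell
# Pre-flight sketch: report which required CLI tools are missing from PATH.
# The list of tools passed in is an assumption; adjust it as needed.
missing_tools() {
  local tool missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Example (run on the Bastion VM):
#   missing_tools az kubectl helm python3 && kubectl cluster-info
```

If the function returns success, `kubectl cluster-info` is a quick way to confirm that the Bastion VM can reach the K8s API server.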
The release packages are stored in a SharePoint location and in Azure Artifacts. Follow the steps below to download them.
Primary download location: SharePoint
Navigate to sharepoint location: https://ustglobal.sharepoint.com/teams/InnovationEngineering/Shared%20Documents/Forms/AllItems.aspx?viewid=f349a736%2D8a62%2D467f%2D8448%2D067be464bd59&id=%2Fteams%2FInnovationEngineering%2FShared%20Documents%2FKnowledge%20Management%2FSmartOps%20Deployment
Open the required release folder (e.g. 7.1.2)
Download the product zip and move it to the target VM
Secondary download location: Azure Artifacts
Prerequisite: Install the Azure CLI (az) on the target deployment VM using the command below:
curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash
Navigate to https://dev.azure.com/USTInnovationEngineering/SmartOps/_packaging?_a=feed&feed=Smartops_Releases
Click on the required package.
Click on Versions
Click on the options button (…) and select ‘Copy Install Command’. The download command will get copied to clipboard.
Log in to the VM where you want to extract the package and execute the command.
Note: If this is the first time, you will be prompted to install the azure-devops extension. Enter ‘Y’ and press Enter to continue.
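For reference, the copied command generally takes the form of an `az artifacts universal download` call. Below is a minimal sketch that only prints such a command for review. The organization, project and feed are read from the feed URL above. Treating the feed as project-scoped is an assumption, the package name and version are placeholders, and the function name artifact_download_cmd is hypothetical. The command copied from the portal remains the authoritative one.

```shell
# Print (not run) an Azure Artifacts download command for review.
# Organization, project and feed are taken from the feed URL in this section;
# "--scope project" is an assumption about how the feed is scoped.
artifact_download_cmd() {
  local name="$1" version="$2" dest="${3:-.}"
  if [ -z "$name" ] || [ -z "$version" ]; then
    echo "usage: artifact_download_cmd <package_name> <version> [dest_dir]" >&2
    return 1
  fi
  printf 'az artifacts universal download --organization "%s" --project "%s" --scope project --feed "%s" --name "%s" --version "%s" --path "%s"\n' \
    "https://dev.azure.com/USTInnovationEngineering/" "SmartOps" "Smartops_Releases" \
    "$name" "$version" "$dest"
}

# Example: artifact_download_cmd <package_name> 7.1.2
```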
Azure MySQL firewall policies
Public network access should be denied.
Allow access to Azure MySQL by configuring firewall policies
Enforce SSL connection should be Enabled
Key vault permissions for the Azure AD user.
Access to key vault can be enabled by configuring Key Vault access policies.
Before proceeding with deployment, validate the following:
Access is enabled for Disk Encryption
The permission model is set to Vault access policy
The respective applications and resources are added with the required access under the APPLICATION section
The Azure service principal needs Get permission on Key Vault secrets.
The disk encryption set, storage account and Azure MySQL instance need Get, Wrap Key and Unwrap Key permissions. Refer to the example below for the disk encryption set.
User access can be enabled by adding users, with the required permissions on keys, secrets and certificates, under the USER section.
Private Endpoint’s IP associations
We have 4 private endpoints per environment. For these 4 private endpoints, we have 3 private DNS zones: one private zone each for Azure Blob, Azure Key Vault and Azure MySQL instances.
There should be a private IP against each of these services. If this private IP is not associated with the respective private endpoint, application deployments will fail because the K8s cluster cannot communicate with these private endpoints. If it is found not associated, it must be added manually via the Azure portal.
Please find below screenshots for reference.
Key Vault Private Endpoint and its Private IP
Above Key Vault’s Private IP associated with respective Private Link of Key Vault
Update Private IP in Private Link
Refer to the example below, from the product invoiceext, to update the Private Link when the IP is missing. Check the equivalent resources for the required product (itops) accordingly.
Refer to the screenshot below, where the private endpoint’s IP is not associated with the Private Link.
When private endpoint IPs are not listed, add them by adding a record set for the respective resource.
When adding a new record set, enter the instance name in the Name field and the private endpoint IP in the IP Address field.
If the record set is present but the IP is not associated, click the record set (here, Azure Key Vault kv-invoiceext-re01).
Update the private IP in the ‘IP address’ field.
Node pools’ zone redundancy
Note: GPU pools do not have zone redundancy enabled since it is not supported in AKS.
Install Tools: script to install prerequisite packages in bastion VM
cd <package_path>/kubernetes/azure-setup/scripts
Set execution permission
chmod +x installbastiontools.sh
Execute the shell script
./installbastiontools.sh
Connect to cluster (Kube config configured)
Select the Kubernetes cluster
Click ‘connect’ and follow the command provided in Azure portal
Please find below screenshots of the commands for reference on how to connect the Kubernetes cluster from Bastion VM.
Pre-check: Verify that Python 3.6 is installed on the bastion VM (SmartInstall runs on Python 3.6).
Python 3.6 installation link reference
Ubuntu 18.04 comes with Python 3.6 as default.
(Create DefArc Storage + Private End Point, and update Environment JSON.)
az storage account create \
Private endpoint creation steps
Once created, go to the resource group and verify the def archival storage account.
Before starting the deployment, set the secrets in Azure Key Vault using the create-az-kv-secrets.sh script in the kv-init folder, with values updated for the product infrastructure created.
Before executing the create-az-kv-secrets.sh script, the respective access policies must be set on the Key Vault for the Azure AD user.
Before executing the create-az-kv-secrets.sh script, the Key Vault’s firewall must allow access from ‘All networks’. Refer to the screenshot below.
Note: Alternatively, whitelist the IP through the firewall policy instead of allowing all networks.
Execute secret creation script against the Key Vault
Azure Login from bastion VM
Once signed in successfully, a message like the one below appears in the browser
Set the subscription using the command below
az account set --subscription <subscription_id>
After updating create-az-kv-secrets.sh with the latest values for the product infrastructure, execute the script using the syntax below
./create-az-kv-secrets.sh <subscription_id> <key_vault_name> <namespace_name>
On successful completion, all the required secrets are created in the Azure Key Vault instance.
Once the script has executed successfully and the created secrets have been validated, switch the Key Vault’s firewall policy back from ‘All networks’ to ‘Private endpoint and selected networks’ and save the change. Refer to the screenshot below.
<NameSpace>-du-api-key (a dummy value can be given here, as DU is not deployed as part of ITOps)
<NameSpace>-def-archival-filestore-secret-key (Access key of STORAGE_ACCOUNT_NAME created above)
<NameSpace>-itops-mongo-user (itopsadmin)
<NameSpace>-itops-mongo-password (itopsadmin123)
For def-archival-filestore-secret-key: go to the def archival storage account created, open Access keys, click Show keys and copy the first key value.
Then go to the Key Vault and create the secret as below, providing the access key copied from the def archival storage account as the secret value.
For itops-mongo-user
Create the secret with the name shown below and the secret value itopsadmin
For itops-mongo-password
Create the secret with the name shown below and the secret value itopsadmin123
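The four secrets called out above follow a `<NameSpace>-<suffix>` naming pattern. Below is a minimal sketch that lists the expected names for a namespace, so that they can be checked against the Key Vault. The function name expected_secret_names is hypothetical, and the list covers only the secrets named in this section, not necessarily everything the create-az-kv-secrets.sh script creates.

```shell
# List the Key Vault secret names this section expects for a namespace.
# The suffixes are the four listed in this section.
expected_secret_names() {
  local ns="$1" suffix
  [ -n "$ns" ] || { echo "usage: expected_secret_names <namespace>" >&2; return 1; }
  for suffix in du-api-key def-archival-filestore-secret-key \
                itops-mongo-user itops-mongo-password; do
    printf '%s-%s\n' "$ns" "$suffix"
  done
}

# Example check against a vault (requires az login; <key_vault_name> is a placeholder):
#   for s in $(expected_secret_names itopsv1-stg01); do
#     az keyvault secret show --vault-name <key_vault_name> --name "$s" --query name -o tsv \
#       || echo "missing: $s"
#   done
```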
As part of the ITOps 2.0.2 release, the clones stack has been changed from clones-engine-studio-sense-queue to clones-engine-studio-queue.
Ensure that the clones-engine-studio-sense-queue release is uninstalled before proceeding with the deployment, using the command below:
helm uninstall -n itopsv1-stg01 --kube-context kc-smartops-itops-stg01 itopsv1-stg01-clones-engine-studio-sense-queue-rel
SmartInstall uses an environment JSON file to install the respective application.
Before deployment, the environment JSON file needs to be updated as required.
A template of the environment JSON is available in the package.
Refer to the environment JSON keys and their details below.
| Keys | Sub Keys (level 1) | Sub Keys (level 2) | Sub Keys (level 3) | Suggested Values | Info |
| name | | | | stg01 | Name of the environment |
| product | | | | itopstv1 | Name of the product which needs to be deployed (JSON file name in the products folder) |
| version | | | | 7.1.2 | Helm chart version |
| dnsName | | | | | DNS name of the environment |
| includeIngress | | | | true | Whether Ingress needs to be deployed or not |
| ingressIp | | | | | IP of Ingress |
| isPrivateIngress | | | | true | For a private Kubernetes cluster, the internal traffic is through the internal Kubernetes load balancer. Do not change this setting unless there is a specific use case. |
| gpuEnabled | | | | false | For a Kubernetes cluster which needs GPU node pools |
| helmRepoLocation | | | | ../charts | Helm repo location. Either the smartops-helm repo or the charts folder inside the package |
| defaultAppReplicaCount | | | | 2 | Number of replicas of application containers |
| secretProvider | | | | | For managing Kubernetes secrets |
| | azure | | | | Provider is Azure for a K8s cluster deployed in Azure infrastructure |
| | tenantId | | | | Tenant ID of the Azure subscription |
| | servicePrincipal | | | | Service principal client ID and client secret |
| | | clientId | | | |
| | | clientSecret | | | |
| | keyVaultName | | | | Azure Key Vault name where the secrets are configured with their respective values |
| autoScaling | | | | | For critical application containers, autoscaling is enabled through the Kubernetes Horizontal Pod Autoscaler |
| | enabled | | | true | Set true to enable autoscaling for supported services. |
| diskEncryption | | | | | Encryption for data at rest. |
| | enabled | | | true | |
| | azure | | | | Azure Disk Encryption Set ID. |
| storage | | | | | Details of various data stores. |
| | defarchivalFileStore | | | | |
| | | azure | | | Provider Azure |
| | | storageAccount | | | Storage account for def-archival-filestore |
| | mysql | | | | |
| | | host | | | Azure MySQL instance name |
| | | port | | | Port number |
| | | backup | | | |
| | | | enabled | true | |
| | | | schedule | 0 2 * * * | |
| | redis | | | | |
| | | host | | | Azure Cache for Redis instance name |
| | | port | | | Port number |
| | appFileStore | | | | |
| | | azure | | | Provider Azure |
| | | storageAccount | | | Storage account name for application files storage |
| | modelFileStore | | | | |
| | | azure | | | Provider Azure |
| | | storageAccount | | | Storage account where the pre-trained models are stored for various applications. |
| | backupFileStore | | | | |
| | | azure | | | Provider Azure |
| | | storageAccount | | | Storage account where backup files are stored |
| | mongo | | | | |
| | | volumeSize | | | Mongo instance details with the volume configuration, backup and its schedule. |
| | | backup | | | |
| | | | enabled | true | |
| | | | schedule | 0 2 * * * | |
| | elasticsearch | | | | |
| | | volumeSize | | | Elasticsearch instance details with the volume configuration, backup and its schedule. |
| | | backup | | | |
| | | | enabled | true | |
| | | | schedule | 0 2 * * * | |
| | rabbitmq | | | | |
| | | volumeSize | | | RabbitMQ instance details with the volume configuration, backup and its schedule. |
| | | backup | | | |
| | | | enabled | true | |
| | | | schedule | 0 2 * * * | |
| | appStatefulSets | | | | Volume size configuration for application services which are statefulsets, if any |
| | | volumeSize | | 16Gi | |
| logMonitoring | | | | | Details for enabling log monitoring, log retention, cleanup and storage volume size. |
| | enabled | | | true | Recommended to set as true |
| | logRetentionInDays | | | 5 | Logs older than the configured number of days are automatically removed as per the cleanup cron schedule. |
| | logCleanUpCronSchedule | | | 0 1 * * * | Time during which the retention job will run. |
| | logVolumeSize | | | 128Gi | Immutable after first install. |
| dataRestore | databases | | | | This section applies only when SmartInstall runs in restore mode. List of data stores which need to be restored |
| | mysqlBackupPath | | | | Folder name inside Azure Blob where MySQL backup files are stored |
| | mysqlBackupFileName | | | | File name of the MySQL backup file |
| | mongoBackupPath | | | | Folder name inside Azure Blob where Mongo backup files are stored |
| | mongoBackupFileName | | | | File name of the Mongo backup file |
| | elasticBasePath | | | | Path of the Elasticsearch backup file in Azure Blob |
| | minioBackupPath | | | | Folder name of the Minio backup file in Azure Blob |
| | rabbitmqBackupPath | | | | Folder name inside Azure Blob where RabbitMQ backup files are stored |
| | rabbitmqBackupFileName | | | | File name of the RabbitMQ backup file |
| | restoreContainer | | | | Azure Blob container name where backup files are stored |
| roleBasedAccess | superAdmin | | | | |
| | | enabled | | false | |
| | | groupId | | NA | |
| | productViewer | | | | |
| | | enabled | | false | |
| | | groupId | | NA | |
| | productAdmin | | | | |
| | | enabled | | false | |
| | | groupId | | NA | |
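The backup schedule values (0 2 * * *) and logCleanUpCronSchedule (0 1 * * *) in the table above are standard five-field cron expressions, meaning 02:00 and 01:00 every day respectively. Below is a minimal sketch that splits such an expression into its named fields, which can help when changing a schedule in the environment JSON. The function name describe_cron is hypothetical. The time zone in which the schedule is evaluated depends on the cluster and is not covered here.

```shell
# Split a five-field cron expression into named fields.
# `read` with a here-document is used so that the literal "*" characters
# are never glob-expanded by the shell.
describe_cron() {
  local minute hour dom month dow extra
  read -r minute hour dom month dow extra <<EOF
$1
EOF
  if [ -z "$dow" ] || [ -n "$extra" ]; then
    echo "expected 5 fields: minute hour day-of-month month day-of-week" >&2
    return 1
  fi
  echo "minute=$minute hour=$hour day-of-month=$dom month=$month day-of-week=$dow"
}

describe_cron "0 2 * * *"   # backup schedule from the table
describe_cron "0 1 * * *"   # log cleanup schedule from the table
```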
Once the packages are downloaded to the bastion VM, execute the commands below to install the application without data restore
cd <package_path>/kubernetes/smartinstall
Execute installWithDataInit.py
python3 -u installWithDataInit.py --product ${product} --env ${environment} --kubecontext ${kubecontext} --verbose
product – The application which needs to be deployed, e.g. itops
env – The environment in which the application needs to be deployed, e.g. dev, qa
kubecontext – The kubecontext of the product infrastructure
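Because the install command relies on three shell variables, an unset variable silently produces an incomplete command line. Below is a minimal sketch of a guard that builds the command only when all three values are present and prints it for review instead of running it. The function name smartinstall_cmd is hypothetical. The sample values are taken from elsewhere in this document (itops, stg01, kc-smartops-itops-stg01) and should be replaced with the real ones.

```shell
# Build the SmartInstall command line after checking that all three
# arguments are present. The command is printed, not executed.
smartinstall_cmd() {
  local product="$1" environment="$2" kubecontext="$3"
  if [ -z "$product" ] || [ -z "$environment" ] || [ -z "$kubecontext" ]; then
    echo "usage: smartinstall_cmd <product> <env> <kubecontext>" >&2
    return 1
  fi
  printf 'python3 -u installWithDataInit.py --product %s --env %s --kubecontext %s --verbose\n' \
    "$product" "$environment" "$kubecontext"
}

smartinstall_cmd itops stg01 kc-smartops-itops-stg01
```

Before running the printed command, `kubectl config get-contexts -o name` can be used to confirm that the kubecontext exists on the bastion VM.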
Please follow the steps in the video below for SSL enablement: https://web.microsoftstream.com/video/fc814048-9405-423d-adca-22d28ecc30bc?list=trending
Appendix consists of the following sections:
To enable data encryption for Azure MySQL and storage accounts, and disk encryption for volumes in the Kubernetes cluster, we need to create encryption keys in Azure Key Vault, which are used to encrypt the data.
Select the key vault and click ‘Keys’
Click ‘Generate/Import’
Set a Key name and click ‘Create’. Key Type and Key Size can be set with the default values unless there is a specific requirement.
Reference: https://docs.microsoft.com/en-us/azure/aks/azure-disk-customer-managed-keys
# Create a DiskEncryptionSet
# Key vault name, resource group, etc. need to be changed accordingly
# The key key-smartops-k8s-disk-enc-001 (key name given as an example) needs to be created in Azure Key Vault before creating the Disk Encryption Set
# The --query expressions are quoted so that the shell does not treat [..] as a glob pattern
keyVaultId=$(az keyvault show --name kv-engg-resrch-001 --query "[id]" -o tsv)
keyVaultKeyUrl=$(az keyvault key show --vault-name kv-engg-resrch-001 --name key-smartops-k8s-disk-enc-001 --query "[key.kid]" -o tsv)
az disk-encryption-set create -n smartops-k8s-des-001 -l eastus -g rg-smartopsengg-dev-001 --source-vault "$keyVaultId" --key-url "$keyVaultKeyUrl"
Azure cloud shell:
Ensure that Get, Wrap and Unwrap permissions on the key created in Azure Key Vault are set for the disk encryption set.
Refer to des-platform-qa01 in the above screenshot.
IMPORTANT: After creating the disk encryption set, select it and click the prompt to allow access to the disk encryption key created in the Key Vault. Please find the screenshot below.
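The same permissions can also be granted from the CLI when the Key Vault uses the Vault access policy permission model. Below is a minimal sketch that prints the two az commands for review rather than running them. The function name des_access_cmds is hypothetical, and the sample names are the example names used in the DiskEncryptionSet commands above.

```shell
# Print the az commands that grant a Disk Encryption Set Get, Wrap Key and
# Unwrap Key permissions on the vault, for review before running them.
# Applies only to vaults using the "Vault access policy" permission model.
des_access_cmds() {
  local des="$1" rg="$2" vault="$3"
  if [ -z "$des" ] || [ -z "$rg" ] || [ -z "$vault" ]; then
    echo "usage: des_access_cmds <des_name> <resource_group> <key_vault_name>" >&2
    return 1
  fi
  # First command: look up the managed identity of the Disk Encryption Set.
  printf 'desIdentity=$(az disk-encryption-set show -n %s -g %s --query "[identity.principalId]" -o tsv)\n' "$des" "$rg"
  # Second command: add a key access policy for that identity.
  printf 'az keyvault set-policy -n %s --object-id "$desIdentity" --key-permissions get wrapkey unwrapkey\n' "$vault"
}

des_access_cmds smartops-k8s-des-001 rg-smartopsengg-dev-001 kv-engg-resrch-001
```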
K8s storage class
# Currently kept as a part of the env-setup template. Can be changed as required
# diskEncryptionSetID values need to be changed accordingly (subscriptions, resourceGroups, diskEncryptionSets)
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: pvc-ade-custom-storage-class
provisioner: kubernetes.io/azure-disk
parameters:
  kind: Managed
  skuname: Premium_LRS
  diskEncryptionSetID: "/subscriptions/dfaa090f-c407-4e75-ac08-143cb932bdcf/resourceGroups/rg-smartopsengg-dev-001/providers/Microsoft.Compute/diskEncryptionSets/smartops-k8s-des-001"
After deploying the storage class, the PVCs of the statefulsets must be updated to refer to the above custom storage class.
References:
https://docs.microsoft.com/en-us/azure/mysql/howto-data-encryption-portal
https://docs.microsoft.com/en-us/azure/mysql/concepts-data-encryption-mysql
Data Encryption of Azure MySQL instance is done at server level using CMK ( customer-managed key)
Key Terminologies
Key Encryption Key [ KEK ]
Stored in Azure Key Vault
KEK is used to encrypt Data Encryption Key
Data Encryption Key [ DEK ]
Symmetric key used to encrypt a block of data
DEKs encrypted with the KEK are stored separately
MySQL server needs the below permissions on Azure Key Vault
get
wrapKey
unwrapKey
Key Vault and Azure Database for MySQL must belong to the same Azure AD tenant
Enable the soft delete feature on the Key Vault instance
Key must be in ‘Enabled’ state
When keys are imported, only .pfx, .byok and .backup file formats are supported
If Key Vault generates the key, create a key backup before using it for the first time.
Backup Azure Key Vault Key Reference
When you configure data encryption with a customer-managed key in Key Vault, continuous access to this key is required for the server to stay online. If the server loses access to the customer-managed key in Key Vault, the server begins denying all connections within 10 minutes. The server issues a corresponding error message and changes the server state to Inaccessible. Some of the reasons why the server can reach this state are:
If you create a Point in Time Restore server for an Azure Database for MySQL server that has data encryption enabled, the newly created server will be in Inaccessible state. You can fix this through the Azure portal or CLI.
If you create a read replica for an Azure Database for MySQL server that has data encryption enabled, the replica server will be in Inaccessible state. You can fix this through the Azure portal or CLI.
If you delete the Key Vault, the Azure Database for MySQL will be unable to access the key and will move to Inaccessible state. Recover the Key Vault and revalidate the data encryption to make the server Available.
If you delete the key from the Key Vault, the Azure Database for MySQL will be unable to access the key and will move to Inaccessible state. Recover the key and revalidate the data encryption to make the server Available.
If the key stored in the Azure Key Vault expires, the key will become invalid and the Azure Database for MySQL will transition into Inaccessible state. Extend the key expiry date using the CLI and then revalidate the data encryption to make the server Available.
Limitations
Support for this functionality is limited to General Purpose and Memory Optimized pricing tiers.
This feature is only supported in regions and servers which support storage up to 16TB. For the list of Azure regions supporting storage up to 16TB, refer to the storage section in documentation here
Encryption is only supported with RSA 2048 cryptographic key.
Steps
Create the Azure Database for MySQL instance. Please find below a sample screenshot of the configurations.
Click Data Encryption on the ‘Security’ section and click ‘Yes’ for ‘Use customer managed key’
After clicking ‘Yes’ for CMK, select the Key Vault and the key created in the Azure Key Vault instance
Click Save
Restart Azure Database for MySQL
Errors Observed while configuring
If soft delete is not enabled for the Key Vault, an error like the one below is shown
The above issue was resolved by creating a new Key Vault instance with soft delete enabled and enabling purge protection after Key Vault creation.
Please find below the screenshot after configuring CMK for enabling data encryption.
Steps
Select the storage account from respective resource group which needs to be encrypted and click on encryption
Click ‘Customer-managed keys’ radio button
Select the key vault and key by clicking ‘Select a key vault and key’
Select the key vault from the drop down list
Select the key from the key vault.
After selecting the key vault and key from the drop down list, click ‘Select’ button.
Click ‘Save’ to save the changes.
Please note: If this is the first time, you will be prompted to install the azure-devops extension. Enter ‘Y’ and press Enter to continue.
| Container name | CPU Threshold | min replicas | max replicas |
| clones-engine | 80% | 2 | 4 |
| correlation | 80% | 2 | 4 |
| alertmapping | 80% | 2 | 4 |
K9s is installed when the installbastiontools.sh script is executed. Please refer
From the home directory, execute the command below to open K9s
k9s/k9s
Or
cd k9s
./k9s
As the artifacts reference for Core Platform has changed from Archiva to jFrog-Artifactory, the 2 stacks deployed for smartops-archiva are expected to be uninstalled.
| Issues | Remarks |
| smartops-secrets stack failure | Secrets not correctly updated in Azure Key Vault or smartops-secrets chart |
| Restore failures | 1. All databases should be deployed and running in healthy state |
1. Log in to the SmartOps Master Admin UI and revoke the offline tokens generated for all the organizations.
2. Navigate to the Keycloak Administration console and perform step 3 for all organizations except master.
3. Go to the Users tab, select the sense_master user, go to the Consents tab and check whether there are any Offline Token entries. If yes, click the Revoke button.
4. Uninstall the Keycloak services in K8s.
5. Clear invalid offline token entries from the DB, if any, by executing the SQL scripts below in MySQL.
TRUNCATE table keycloak.offline_client_session;
TRUNCATE table keycloak.offline_user_session;
6. Install the Keycloak services in K8s.
7. Log in to the SmartOps Master Admin UI and generate a new offline token for all the organizations.
8. Update the Key Vault with the new token generated for USTGlobal.
9. Restart the secret vault sync pod to reflect the new value.
10. Confirm that the pods are using the new offline token.