
GMC: resource management #259

Merged · 43 commits · Aug 17, 2024
Conversation

KfreeZ
Collaborator

@KfreeZ KfreeZ commented Aug 3, 2024

Description

This PR introduces resource management in the GMC controller in order to:

  • delete the resources when a GMC pipeline is deleted
  • delete a resource when it is removed from a pipeline
  • record more detailed status for the resources
  • update resource status based on events
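As the status output in the tests below shows, each provisioned resource is tracked under a status annotation keyed by kind, apiVersion, name, and namespace, with the value `provisioned`. A minimal stdlib-only sketch of building such a key (the helper name is hypothetical; the key format matches the one visible in this PR):

```go
package main

import "fmt"

// annotationKey builds the status-annotation key used to track a provisioned
// resource: "<Kind>:<apiVersion>:<name>:<namespace>". The helper name is
// hypothetical; the key format matches the annotations shown in this PR.
func annotationKey(kind, apiVersion, name, namespace string) string {
	return fmt.Sprintf("%s:%s:%s:%s", kind, apiVersion, name, namespace)
}

func main() {
	key := annotationKey("Deployment", "apps/v1", "tgi-service-deployment", "codegen")
	fmt.Println(key) // Deployment:apps/v1:tgi-service-deployment:codegen
}
```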

Issues

#192
#193
#194

Type of change

The types of change in this PR:

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds new functionality)
  • Breaking change (fix or feature that would break existing design and interface)

Dependencies

List any newly introduced third-party dependencies, if applicable.

Tests

  1. Get the status of the deployment
  status:
    accessUrl: http://router-service.dataprep.svc.cluster.local:8080
    annotations:
      ConfigMap:v1:data-prep-config:dataprep: provisioned
      ConfigMap:v1:embedding-usvc-config:dataprep: provisioned
      ConfigMap:v1:llm-uservice-config:dataprep: provisioned
      ConfigMap:v1:reranking-usvc-config:dataprep: provisioned
      ConfigMap:v1:retriever-usvc-config:dataprep: provisioned
      ConfigMap:v1:tei-config:dataprep: provisioned
      ConfigMap:v1:teirerank-config:dataprep: provisioned
      ConfigMap:v1:tgi-config:dataprep: provisioned
      Deployment:apps/v1:data-prep-svc-deployment:dataprep: |
        Replicas: 1 desired | 1 updated | 2 total | 1 available | 1 unavailable
        Conditions:
          Type: Available
          Status: True
          Reason: MinimumReplicasAvailable
          Message: Deployment has minimum availability.
          Type: Progressing
          Status: True
          Reason: ReplicaSetUpdated
          Message: ReplicaSet "data-prep-svc-deployment-86f698cd8c" is progressing.
      Deployment:apps/v1:embedding-svc-deployment:dataprep: |
        Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
        Conditions:
          Type: Available
          Status: True
          Reason: MinimumReplicasAvailable
          Message: Deployment has minimum availability.
          Type: Progressing
          Status: True
          Reason: NewReplicaSetAvailable
          Message: ReplicaSet "embedding-svc-deployment-858b75bbd" has successfully progressed.
      Deployment:apps/v1:llm-svc-deployment:dataprep: |
        Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
        Conditions:
          Type: Available
          Status: True
          Reason: MinimumReplicasAvailable
          Message: Deployment has minimum availability.
          Type: Progressing
          Status: True
          Reason: NewReplicaSetAvailable
          Message: ReplicaSet "llm-svc-deployment-649b7fbd7" has successfully progressed.
      Deployment:apps/v1:redis-vector-db-deployment:dataprep: |
        Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
        Conditions:
          Type: Available
          Status: True
          Reason: MinimumReplicasAvailable
          Message: Deployment has minimum availability.
          Type: Progressing
          Status: True
          Reason: NewReplicaSetAvailable
          Message: ReplicaSet "redis-vector-db-deployment-587844d666" has successfully progressed.
      Deployment:apps/v1:reranking-svc-deployment:dataprep: |
        Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
        Conditions:
          Type: Available
          Status: True
          Reason: MinimumReplicasAvailable
          Message: Deployment has minimum availability.
          Type: Progressing
          Status: True
          Reason: NewReplicaSetAvailable
          Message: ReplicaSet "reranking-svc-deployment-7745c7cf98" has successfully progressed.
      Deployment:apps/v1:retriever-svc-deployment:dataprep: |
        Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
        Conditions:
          Type: Available
          Status: True
          Reason: MinimumReplicasAvailable
          Message: Deployment has minimum availability.
          Type: Progressing
          Status: True
          Reason: NewReplicaSetAvailable
          Message: ReplicaSet "retriever-svc-deployment-795ccd4f84" has successfully progressed.
      Deployment:apps/v1:router-service-deployment:dataprep: |
        Replicas: 1 desired | 1 updated | 2 total | 1 available | 1 unavailable
        Conditions:
          Type: Available
          Status: True
          Reason: MinimumReplicasAvailable
          Message: Deployment has minimum availability.
          Type: Progressing
          Status: True
          Reason: ReplicaSetUpdated
          Message: ReplicaSet "router-service-deployment-6dcc4bc568" is progressing.
      Deployment:apps/v1:tei-embedding-svc-deployment:dataprep: |
        Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
        Conditions:
          Type: Available
          Status: True
          Reason: MinimumReplicasAvailable
          Message: Deployment has minimum availability.
          Type: Progressing
          Status: True
          Reason: NewReplicaSetAvailable
          Message: ReplicaSet "tei-embedding-svc-deployment-54b58d57cb" has successfully progressed.
      Deployment:apps/v1:tei-reranking-svc-deployment:dataprep: |
        Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
        Conditions:
          Type: Available
          Status: True
          Reason: MinimumReplicasAvailable
          Message: Deployment has minimum availability.
          Type: Progressing
          Status: True
          Reason: NewReplicaSetAvailable
          Message: ReplicaSet "tei-reranking-svc-deployment-54c5dd5795" has successfully progressed.
      Deployment:apps/v1:tgi-service-m-deployment:dataprep: |
        Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
        Conditions:
          Type: Available
          Status: True
          Reason: MinimumReplicasAvailable
          Message: Deployment has minimum availability.
          Type: Progressing
          Status: True
          Reason: NewReplicaSetAvailable
          Message: ReplicaSet "tgi-service-m-deployment-5fcff459f5" has successfully progressed.
      Service:v1:data-prep-svc:dataprep: http://data-prep-svc.dataprep.svc.cluster.local:6007/v1/dataprep
      Service:v1:embedding-svc:dataprep: http://embedding-svc.dataprep.svc.cluster.local:6000/v1/embeddings
      Service:v1:llm-svc:dataprep: http://llm-svc.dataprep.svc.cluster.local:9000/v1/chat/completions
      Service:v1:redis-vector-db:dataprep: http://redis-vector-db.dataprep.svc.cluster.local:6379
      Service:v1:reranking-svc:dataprep: http://reranking-svc.dataprep.svc.cluster.local:8000/v1/reranking
      Service:v1:retriever-svc:dataprep: http://retriever-svc.dataprep.svc.cluster.local:7000/v1/retrieval
      Service:v1:router-service:dataprep: http://router-service.dataprep.svc.cluster.local:8080
      Service:v1:tei-embedding-svc:dataprep: http://tei-embedding-svc.dataprep.svc.cluster.local:80
      Service:v1:tei-reranking-svc:dataprep: http://tei-reranking-svc.dataprep.svc.cluster.local:80/rerank
      Service:v1:tgi-service-m:dataprep: http://tgi-service-m.dataprep.svc.cluster.local:80/generate
  2. Delete the GMC pipeline
spec changed false | meta changed: true
Reconciling connector graph apiVersion gmc.opea.io/v1alpha3 graph codegen
delete resource Service tgi-service codegen
Success to delete Service tgi-service codegen
delete resource ConfigMap llm-uservice-config codegen
Success to delete ConfigMap llm-uservice-config codegen
delete resource ConfigMap tgi-config codegen
Success to delete ConfigMap tgi-config codegen
delete resource Deployment llm-service-deployment codegen
Success to delete Deployment llm-service-deployment codegen
delete resource Deployment router-service-deployment codegen
Success to delete Deployment router-service-deployment codegen
delete resource Deployment tgi-service-deployment codegen
Success to delete Deployment tgi-service-deployment codegen
delete resource Service llm-service codegen
Success to delete Service llm-service codegen
delete resource Service router-service codegen
Success to delete Service router-service codegen
root@cis-gms-worker-3:~/zkf/repo/kfreez/GenAIInfra# kubectl delete gmc -n codegen codegen 
gmconnector.gmc.opea.io "codegen" deleted
root@cis-gms-worker-3:~/zkf/repo/kfreez/GenAIInfra# 
root@cis-gms-worker-3:~/zkf/repo/kfreez/GenAIInfra# 
root@cis-gms-worker-3:~/zkf/repo/kfreez/GenAIInfra# kubectl get deployment -n codegen 
No resources found in codegen namespace.
root@cis-gms-worker-3:~/zkf/repo/kfreez/GenAIInfra# kubectl get service -n codegen 
No resources found in codegen namespace.
root@cis-gms-worker-3:~/zkf/repo/kfreez/GenAIInfra# kubectl get cm -n codegen 
NAME               DATA   AGE
  3. Delete a delta config
    Deploy a codegen example:
root@cis-gms-worker-3:~/zkf/repo/kfreez/GenAIInfra# kubectl -n codegen apply -f microservices-connector/config/samples/codegen_xeon.yaml
gmconnector.gmc.opea.io/codegen configured
root@cis-gms-worker-3:~/zkf/repo/kfreez/GenAIInfra# kubectl get deployment -n codegen
NAME                        READY   UP-TO-DATE   AVAILABLE   AGE
llm-service-deployment      1/1     1            1           77m
router-service-deployment   1/1     1            1           77m
tgi-service-deployment      0/1     1            0           77m

Remove the Llm step:

root@cis-gms-worker-3:~/zkf/repo/kfreez/GenAIInfra# kubectl get gmc codegen -n codegen -o yaml
apiVersion: gmc.opea.io/v1alpha3
kind: GMConnector
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"gmc.opea.io/v1alpha3","kind":"GMConnector","metadata":{"annotations":{},"labels":{"app.kubernetes.io/managed-by":"kustomize","app.kubernetes.io/name":"gmconnector","gmc/platform":"xeon"},"name":"codegen","namespace":"codegen"},"spec":{"nodes":{"root":{"routerType":"Sequence","steps":[{"data":"$response","internalService":{"config":{"TGI_LLM_ENDPOINT":"tgi-service","endpoint":"/v1/chat/completions"},"serviceName":"llm-service"},"name":"Llm"},{"internalService":{"config":{"MODEL_ID":"ise-uiuc/Magicoder-S-DS-6.7B","endpoint":"/generate"},"isDownstreamService":true,"serviceName":"tgi-service"},"name":"Tgi"}]}},"routerConfig":{"name":"router","serviceName":"router-service"}}}
  creationTimestamp: "2024-08-05T01:44:52Z"
  finalizers:
  - gmcFinalizer
  generation: 7
  labels:
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/name: gmconnector
    gmc/platform: xeon
  name: codegen
  namespace: codegen
  resourceVersion: "10108466"
  uid: 4afaedd2-dcdf-4fcd-94b9-6372e561cf6b
spec:
  nodes:
    root:
      routerType: Sequence
      steps:
      - internalService:
          config:
            MODEL_ID: ise-uiuc/Magicoder-S-DS-6.7B
            endpoint: /generate
          isDownstreamService: true
          serviceName: tgi-service
        name: Tgi
  routerConfig:
    name: router
    nameSpace: ""
    serviceName: router-service
status:
  accessUrl: http://router-service.codegen.svc.cluster.local:8080
  annotations:
    ConfigMap:v1:tgi-config:codegen: provisioned
    Deployment:apps/v1:router-service-deployment:codegen: "Status:True\n\tReason:NewReplicaSetAvailable\n\tMessageReplicaSet
      \"router-service-deployment-9cd877b6b\" has successfully progressed."
    Deployment:apps/v1:tgi-service-deployment:codegen: "Status:False\n\tReason:ProgressDeadlineExceeded\n\tMessageReplicaSet
      \"tgi-service-deployment-95dfc4575\" has timed out progressing."
    Service:v1:router-service:codegen: http://router-service.codegen.svc.cluster.local:8080
    Service:v1:tgi-service:codegen: http://tgi-service.codegen.svc.cluster.local:80/generate
  condition: {}
  status: 1/0/2

controller log:

spec changed true | meta changed: false
Reconciling connector graph apiVersion gmc.opea.io/v1alpha3 graph codegen

reconcile resource for node: Tgi
trying to reconcile internal service [ tgi-service ] in namespace  
get resource config: {Tgi { {tgi-service  map[MODEL_ID:ise-uiuc/Magicoder-S-DS-6.7B endpoint:/generate] true} }    }
The raw yaml file has been split into 4 yaml files
Success to reconcile ConfigMap: tgi-config
Success to reconcile Service: tgi-service
Success to reconcile Deployment: tgi-service-deployment
the service URL is: http://tgi-service.codegen.svc.cluster.local:80/generate

user config {codegen router-service router-service-deployment    '{"nodes":{"root":{"routerType":"Sequence","steps":[{"name":"Tgi","internalService":{"serviceName":"tgi-service","config":{"MODEL_ID":"ise-uiuc/Magicoder-S-DS-6.7B","endpoint":"/generate"},"isDownstreamService":true}}]}},"routerConfig":{"name":"router","serviceName":"router-service","nameSpace":"","config":null}}'}
Success to reconcile Deployment: router-service-deployment
Success to reconcile Service: router-service
the router URL is: http://router-service.codegen.svc.cluster.local:8080

spec changed false | meta changed: false
Success to delete Service:v1:llm-service:codegen
Success to delete ConfigMap:v1:llm-uservice-config:codegen
Success to delete Deployment:apps/v1:llm-service-deployment:codegen

result:

root@cis-gms-worker-3:~/zkf/repo/kfreez/GenAIInfra# kubectl get deployment -n codegen
NAME                        READY   UP-TO-DATE   AVAILABLE   AGE
router-service-deployment   1/1     1            1           77m
tgi-service-deployment      0/1     1            0           77m
  4. Update based on resource status changes

Only the router service is up when the pipeline is deployed:

root@cis-gms-worker-3:~# kubectl get gmc -n codegen
NAME      URL                                                    STATUS   AGE
codegen   http://router-service.codegen.svc.cluster.local:8080   1/0/3    28s
root@cis-gms-worker-3:~#
root@cis-gms-worker-3:~# kubectl get deployment -n codegen
NAME                        READY   UP-TO-DATE   AVAILABLE   AGE
llm-service-deployment      0/1     1            0           34s
router-service-deployment   1/1     1            1           33s
tgi-service-deployment      0/1     1            0           33s

The llm service comes up after 6 minutes; the event triggers the update:

| codegen:llm-service-deployment: status changed from : False to True|
-----------------Reconciling GMConnector namespace codegen name codegen -------------------------
Reconciling kind  GMConnector apiVersion  gmc.opea.io/v1alpha3  graph  codegen

reconcile resource for node: Llm
trying to reconcile internal service [ llm-service ] in namespace  
get resource config: {Llm { {llm-service  map[TGI_LLM_ENDPOINT:tgi-service endpoint:/v1/chat/completions] false} } $response   }
The raw yaml file has been split into 4 yaml files
Success to reconcile ConfigMap: llm-uservice-config
Success to reconcile Service: llm-service
find downstream service for Llm with name tgi-service 
Success to reconcile Deployment: llm-service-deployment
the service URL is: http://llm-service.codegen.svc.cluster.local:9000/v1/chat/completions

reconcile resource for node: Tgi
trying to reconcile internal service [ tgi-service ] in namespace  
get resource config: {Tgi { {tgi-service  map[MODEL_ID:ise-uiuc/Magicoder-S-DS-6.7B endpoint:/generate] true} }    }
The raw yaml file has been split into 4 yaml files
Success to reconcile ConfigMap: tgi-config
Success to reconcile Service: tgi-service
Success to reconcile Deployment: tgi-service-deployment
the service URL is: http://tgi-service.codegen.svc.cluster.local:80/generate

user config {codegen router-service router-service-deployment    '{"metadata":{"creationTimestamp":null},"spec":{"nodes":{"root":{"routerType":"Sequence","steps":[{"name":"Llm","internalService":{"serviceName":"llm-service","config":{"TGI_LLM_ENDPOINT":"tgi-service","endpoint":"/v1/chat/completions"}},"data":"$response","serviceUrl":"http://llm-service.codegen.svc.cluster.local:9000/v1/chat/completions"},{"name":"Tgi","internalService":{"serviceName":"tgi-service","config":{"MODEL_ID":"ise-uiuc/Magicoder-S-DS-6.7B","endpoint":"/generate"},"isDownstreamService":true},"serviceUrl":"http://tgi-service.codegen.svc.cluster.local:80/generate"}]}},"routerConfig":{"name":"router","serviceName":"router-service","nameSpace":"","config":null}},"status":{"condition":{"lastUpdateTime":null}}}'}
Success to reconcile Deployment: router-service-deployment
Success to reconcile Service: router-service
the router URL is: http://router-service.codegen.svc.cluster.local:8080

| spec changed false | meta changed: false |

The GMC's status is updated accordingly:

root@cis-gms-worker-3:~# kubectl get pods -n codegen
NAME                                         READY   STATUS              RESTARTS   AGE
llm-service-deployment-6f45f9cf64-rqmsn      1/1     Running             0          6m46s
router-service-deployment-559645d584-vhws8   1/1     Running             0          6m45s
tgi-service-deployment-855cb977c6-hh9jm      0/1     ContainerCreating   0          6m45s
root@cis-gms-worker-3:~# kubectl get gmc -n codegen
NAME      URL                                                    STATUS   AGE
codegen   http://router-service.codegen.svc.cluster.local:8080   2/0/3    6m50s
root@cis-gms-worker-3:~# kubectl get gmc -n codegen -o yaml
apiVersion: v1
items:
- apiVersion: gmc.opea.io/v1alpha3
  kind: GMConnector
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"gmc.opea.io/v1alpha3","kind":"GMConnector","metadata":{"annotations":{},"labels":{"app.kubernetes.io/managed-by":"kustomize","app.kubernetes.io/name":"gmconnector","gmc/platform":"xeon"},"name":"codegen","namespace":"codegen"},"spec":{"nodes":{"root":{"routerType":"Sequence","steps":[{"data":"$response","internalService":{"config":{"TGI_LLM_ENDPOINT":"tgi-service","endpoint":"/v1/chat/completions"},"serviceName":"llm-service"},"name":"Llm"},{"internalService":{"config":{"MODEL_ID":"ise-uiuc/Magicoder-S-DS-6.7B","endpoint":"/generate"},"isDownstreamService":true,"serviceName":"tgi-service"},"name":"Tgi"}]}},"routerConfig":{"name":"router","serviceName":"router-service"}}}
    creationTimestamp: "2024-08-14T13:18:07Z"
    generation: 1
    labels:
      app.kubernetes.io/managed-by: kustomize
      app.kubernetes.io/name: gmconnector
      gmc/platform: xeon
    name: codegen
    namespace: codegen
    resourceVersion: "11597342"
    uid: c7486905-bc33-4b64-a558-2097b48b6187
  spec:
    nodes:
      root:
        routerType: Sequence
        steps:
        - data: $response
          internalService:
            config:
              TGI_LLM_ENDPOINT: tgi-service
              endpoint: /v1/chat/completions
            serviceName: llm-service
          name: Llm
        - internalService:
            config:
              MODEL_ID: ise-uiuc/Magicoder-S-DS-6.7B
              endpoint: /generate
            isDownstreamService: true
            serviceName: tgi-service
          name: Tgi
    routerConfig:
      name: router
      serviceName: router-service
  status:
    accessUrl: http://router-service.codegen.svc.cluster.local:8080
    annotations:
      ConfigMap:v1:llm-uservice-config:codegen: provisioned
      ConfigMap:v1:tgi-config:codegen: provisioned
      Deployment:apps/v1:llm-service-deployment:codegen: |
        Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
        Conditions:
          Type: Available
          Status: True
          Reason: MinimumReplicasAvailable
          Message: Deployment has minimum availability.
          Type: Progressing
          Status: True
          Reason: NewReplicaSetAvailable
          Message: ReplicaSet "llm-service-deployment-6f45f9cf64" has successfully progressed.
      Deployment:apps/v1:router-service-deployment:codegen: |
        Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
        Conditions:
          Type: Available
          Status: True
          Reason: MinimumReplicasAvailable
          Message: Deployment has minimum availability.
          Type: Progressing
          Status: True
          Reason: NewReplicaSetAvailable
          Message: ReplicaSet "router-service-deployment-559645d584" has successfully progressed.
      Deployment:apps/v1:tgi-service-deployment:codegen: |
        Replicas: 1 desired | 1 updated | 1 total | 0 available | 1 unavailable
        Conditions:
          Type: Available
          Status: False
          Reason: MinimumReplicasUnavailable
          Message: Deployment does not have minimum availability.
          Type: Progressing
          Status: True
          Reason: ReplicaSetUpdated
          Message: ReplicaSet "tgi-service-deployment-855cb977c6" is progressing.
      Service:v1:llm-service:codegen: http://llm-service.codegen.svc.cluster.local:9000/v1/chat/completions
      Service:v1:router-service:codegen: http://router-service.codegen.svc.cluster.local:8080
      Service:v1:tgi-service:codegen: http://tgi-service.codegen.svc.cluster.local:80/generate
    condition: {}
    status: 2/0/3
kind: List
metadata:
  resourceVersion: ""
root@cis-gms-worker-3:~#

Signed-off-by: KfreeZ <kefei.zhang@intel.com>
@KfreeZ KfreeZ changed the title initial commit for resource manangement feature GMC: resource management Aug 3, 2024
@KfreeZ KfreeZ marked this pull request as draft August 3, 2024 14:19
@KfreeZ KfreeZ changed the title GMC: resource management [WIP] GMC: resource management Aug 3, 2024
KfreeZ added 8 commits August 4, 2024 17:26
Signed-off-by: KfreeZ <kefei.zhang@intel.com>
Signed-off-by: KfreeZ <kefei.zhang@intel.com>
Signed-off-by: KfreeZ <kefei.zhang@intel.com>
Signed-off-by: KfreeZ <kefei.zhang@intel.com>
Signed-off-by: KfreeZ <kefei.zhang@intel.com>
Signed-off-by: KfreeZ <kefei.zhang@intel.com>
var success uint = 0
// graph.SetFinalizers(append(graph.GetFinalizers(), fmt.Sprintf("%s-.-%s-.-%s", obj.GetKind(), obj.GetNamespace(), obj.GetName())))
// save the resource name into annotation for status update and resource management
graph.Status.Annotations[fmt.Sprintf("%s:%s:%s:%s", obj.GetKind(), obj.GetAPIVersion(), obj.GetName(), obj.GetNamespace())] = "provisioned"
Collaborator:

Hi @KfreeZ, is it possible to simplify the annotation key to obj.GetName():obj.GetNamespace()?

Collaborator (Author):

I think not; the apiVersion is needed when I delete this resource.

// save the resource name into annotation for status update and resource management
graph.Status.Annotations[fmt.Sprintf("%s:%s:%s:%s", obj.GetKind(), obj.GetAPIVersion(), obj.GetName(), obj.GetNamespace())] = "provisioned"
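Keeping the apiVersion (and kind) in the key means the controller can later recover everything it needs to identify and delete the resource. A stdlib-only sketch of splitting the key back into its parts (the function name is hypothetical, not from this PR):

```go
package main

import (
	"fmt"
	"strings"
)

// splitAnnotationKey recovers kind, apiVersion, name, and namespace from a
// "<Kind>:<apiVersion>:<name>:<namespace>" status-annotation key. Group
// apiVersions such as "apps/v1" contain a slash but no colon, so a plain
// 4-way split is sufficient. Hypothetical helper, not from this PR.
func splitAnnotationKey(key string) (kind, apiVersion, name, namespace string, err error) {
	parts := strings.SplitN(key, ":", 4)
	if len(parts) != 4 {
		return "", "", "", "", fmt.Errorf("malformed annotation key: %q", key)
	}
	return parts[0], parts[1], parts[2], parts[3], nil
}

func main() {
	k, v, n, ns, _ := splitAnnotationKey("Deployment:apps/v1:llm-service-deployment:codegen")
	fmt.Println(k, v, n, ns) // Deployment apps/v1 llm-service-deployment codegen
}
```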

if obj.GetKind() == Service {
Collaborator:

Hi @KfreeZ, would it be simpler to handle Service and Deployment separately?

KfreeZ added 8 commits August 8, 2024 16:01
Signed-off-by: KfreeZ <kefei.zhang@intel.com>
Signed-off-by: KfreeZ <kefei.zhang@intel.com>
Signed-off-by: KfreeZ <kefei.zhang@intel.com>
Signed-off-by: KfreeZ <kefei.zhang@intel.com>
Signed-off-by: KfreeZ <kefei.zhang@intel.com>
@KfreeZ KfreeZ marked this pull request as ready for review August 12, 2024 08:21
@KfreeZ KfreeZ changed the title [WIP] GMC: resource management GMC: resource management Aug 12, 2024
@@ -153,6 +158,16 @@ func reconcileResource(ctx context.Context, client client.Client, graphNs string
obj.SetNamespace(ns)
}

// set the owner reference for auto deleting the resources when GMC is deleted
obj.SetOwnerReferences([]metav1.OwnerReference{
Collaborator:

ownerRef can only clean up resources in the same namespace; let's decide how we want to handle this.
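For reference, the owner reference set on each child resource would carry roughly the following fields; the struct below is a local stand-in for metav1.OwnerReference, shown only to illustrate the mechanism (Kubernetes garbage collection deletes the children once the owning GMConnector is gone, but only for children in the owner's namespace, which is the limitation noted in this comment):

```go
package main

import "fmt"

// ownerRef is a local stand-in for metav1.OwnerReference, used here only to
// illustrate the fields involved in parent-child garbage collection. It is
// not the real Kubernetes type.
type ownerRef struct {
	APIVersion         string
	Kind               string
	Name               string
	Controller         bool // mark the GMConnector as the managing controller
	BlockOwnerDeletion bool // foreground deletion waits for this child
}

func main() {
	// Values mirror the GMConnector CR from the test transcripts above.
	ref := ownerRef{
		APIVersion:         "gmc.opea.io/v1alpha3",
		Kind:               "GMConnector",
		Name:               "codegen",
		Controller:         true,
		BlockOwnerDeletion: true,
	}
	fmt.Printf("%+v\n", ref)
}
```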

@@ -643,5 +789,7 @@ func (r *GMConnectorReconciler) SetupWithManager(mgr ctrl.Manager) error {
return ctrl.NewControllerManagedBy(mgr).
For(&mcv1alpha3.GMConnector{}).
WithEventFilter(ignoreStatusUpdatePredicate). // Use the predicate here
Owns(&appsv1.Deployment{}).
Collaborator:

Suppose we also receive Service and Deployment updates; we could then update the status annotations from those events as well.

@KfreeZ KfreeZ mentioned this pull request Aug 15, 2024
3 tasks
Signed-off-by: KfreeZ <kefei.zhang@intel.com>

KfreeZ commented Aug 15, 2024

@irisdingbj @zhlsunshine this PR is ready for review

@irisdingbj (Collaborator) left a comment:

We will also need to add e2e cases to cover the different scenarios supported in this PR.

@@ -66,6 +72,7 @@ const (
SpeechT5Gaudi = "SpeechT5Gaudi"
Whisper = "Whisper"
WhisperGaudi = "WhisperGaudi"
gmcFinalizer = "gmcFinalizer"
Collaborator:

Is this unused now?

@@ -212,13 +221,28 @@ func reconcileResource(ctx context.Context, client client.Client, graphNs string
}
}

// // although apply a same config to k8s is fine, but we don't want to do it
Collaborator:

Are we applying the same config or not? This comment is confusing.

}
}

func (r *GMConnectorReconciler) collectResourceStatus(graph *mcv1alpha3.GMConnector, ctx context.Context, externalServiceCnt int) error {
Collaborator:

Better to rename this to collectDeploymentStatus, since it collects status for Deployments only.
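The collected per-Deployment status feeds the pipeline-level summary string seen above in `kubectl get gmc` (e.g. `2/0/3`). A hedged sketch of how such a string might be assembled; the interpretation of the three fields as ready deployments / external services / total is an assumption inferred from the test output, not confirmed by the source:

```go
package main

import "fmt"

// summaryStatus formats the pipeline-level status string shown by
// `kubectl get gmc`, e.g. "2/0/3". The field semantics
// (ready / external / total) are an assumption based on the test
// transcripts in this PR, not confirmed by the source code.
func summaryStatus(ready, external, total int) string {
	return fmt.Sprintf("%d/%d/%d", ready, external, total)
}

func main() {
	// One of three deployments available, as in the first codegen test.
	fmt.Println(summaryStatus(1, 0, 3)) // 1/0/3
}
```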

func reconcileRouterService(ctx context.Context, client client.Client, graph *mcv1alpha3.GMConnector) error {
routerService := &corev1.Service{}
jsonBytes, err := json.Marshal(graph)
func (r *GMConnectorReconciler) reconcileRouterService(ctx context.Context, client client.Client, graph *mcv1alpha3.GMConnector) error {
Collaborator:

Hi @KfreeZ, I notice that r already contains a client.Client, so it's unnecessary to pass client into this function.

KfreeZ added 4 commits August 16, 2024 12:01
Signed-off-by: KfreeZ <kefei.zhang@intel.com>
Signed-off-by: KfreeZ <kefei.zhang@intel.com>
Signed-off-by: KfreeZ <kefei.zhang@intel.com>
Signed-off-by: KfreeZ <kefei.zhang@intel.com>
KfreeZ added 16 commits August 16, 2024 19:20
Signed-off-by: KfreeZ <kefei.zhang@intel.com>
Signed-off-by: KfreeZ <kefei.zhang@intel.com>
Signed-off-by: KfreeZ <kefei.zhang@intel.com>
Signed-off-by: KfreeZ <kefei.zhang@intel.com>
Signed-off-by: KfreeZ <kefei.zhang@intel.com>
Signed-off-by: KfreeZ <kefei.zhang@intel.com>
Signed-off-by: KfreeZ <kefei.zhang@intel.com>
Signed-off-by: KfreeZ <kefei.zhang@intel.com>
Signed-off-by: KfreeZ <kefei.zhang@intel.com>
Signed-off-by: KfreeZ <kefei.zhang@intel.com>
Signed-off-by: KfreeZ <kefei.zhang@intel.com>
Signed-off-by: KfreeZ <kefei.zhang@intel.com>
Signed-off-by: KfreeZ <kefei.zhang@intel.com>
Signed-off-by: KfreeZ <kefei.zhang@intel.com>
@zhlsunshine (Collaborator) left a comment:

We can merge this code into the main branch first; then @KfreeZ, please address the bugs and review comments in follow-up PRs.

@KfreeZ KfreeZ merged commit 81060ab into opea-project:main Aug 17, 2024
9 checks passed
@KfreeZ KfreeZ deleted the deleteResource branch August 30, 2024 00:54