Technical Requirements

Objectives

  • Managed connector to enable users to experience data sharing with usage control

    • A user should be able to sign up and create a connector on a server managed by DG

    • Usage control can be simple, for example: one application has access to the raw data

    • Data could be any dummy data
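
The "simple" usage control described above could be sketched as follows. This is only an illustration under stated assumptions: `UsagePolicy`, `read_raw`, the application id `viewer-app`, and the dummy rows are all hypothetical names invented here, not part of the existing participant software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsagePolicy:
    """Hypothetical policy: which applications may read a dataset's raw data."""
    dataset: str
    allowed_apps: frozenset


def read_raw(policy, app_id, store):
    """Return the raw rows only if the policy allows the requesting application."""
    if app_id not in policy.allowed_apps:
        raise PermissionError(f"{app_id} may not read {policy.dataset}")
    return store[policy.dataset]


# Dummy data standing in for a shared CSV or JSON payload (assumption).
store = {"demo": [{"id": 1, "value": "a"}, {"id": 2, "value": "b"}]}
policy = UsagePolicy(dataset="demo", allowed_apps=frozenset({"viewer-app"}))
```

With this sketch, `read_raw(policy, "viewer-app", store)` returns the dummy rows, while any other application id raises `PermissionError`.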

Chain of thoughts during discussion

  • Current self-hosted process: go to server (SSH) → run participant software installer → generate UI → fire up connector → run connectors → exchange data as CSV or JSON files

  • One server (EC2 instance) could host multiple connectors for one or more users

    • Test the instance's capacity and configuration to find the limit on the number of connectors it can host

    • Experiment with the possibilities and limitations of this idea

      • Start with 4 GB of RAM and observe how many connectors can run

    • Technologies to be used:

      • Docker containers with or without Kubernetes

      • [ OR ]

      • Kubernetes only

  • Limit of connectors per user

    • Have to decide on a number?

  • Server (EC2 instance) will be destroyed after all connectors running inside are stopped

    • Should be automated

  • Multi-threaded creation of connectors

    • How should it be managed?

    • Technology?

    • System Design?

  • Timeline of each connector

    • Destroy the connector after its time is over

  • Expanding capacity using Lambda

    • Need more info?

  • Celery worker to observe/manage visualization

  • Finalize the class of machine (instance type) for the AWS instance
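
Several of the points above (a capacity limit per server, a limit per user, a lifetime per connector, concurrent creation, and destroying the server once it is empty) can be explored with a small in-memory model before touching real infrastructure. The sketch below is not the actual design. `ManagedServer`, `Connector`, and all default numbers other than the 4 GB of RAM are assumptions made for illustration; the per-connector footprint in particular has to be measured in the capacity test.

```python
import threading
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Connector:
    owner: str
    expires_at: datetime


@dataclass
class ManagedServer:
    """In-memory model of one server (e.g. an EC2 instance) hosting connectors."""
    ram_mb: int = 4096           # 4 GB, as in the experiment noted above
    reserved_mb: int = 1024      # assumed OS/runtime overhead
    per_connector_mb: int = 512  # assumed footprint, to be measured
    per_user_limit: int = 2      # assumed, the notes leave this number open
    connectors: list = field(default_factory=list)
    _lock: threading.Lock = field(
        default_factory=threading.Lock, init=False, repr=False, compare=False
    )

    @property
    def capacity(self) -> int:
        """Maximum number of connectors under the assumed memory figures."""
        return max(0, (self.ram_mb - self.reserved_mb) // self.per_connector_mb)

    def create_connector(self, owner, lifetime, now):
        """Create a connector, or return None if a limit would be exceeded."""
        with self._lock:  # makes concurrent creation requests safe
            if len(self.connectors) >= self.capacity:
                return None
            owned = sum(c.owner == owner for c in self.connectors)
            if owned >= self.per_user_limit:
                return None
            connector = Connector(owner, now + lifetime)
            self.connectors.append(connector)
            return connector

    def reap_expired(self, now):
        """Remove connectors whose time is over and return how many were removed."""
        with self._lock:
            before = len(self.connectors)
            self.connectors = [c for c in self.connectors if c.expires_at > now]
            return before - len(self.connectors)

    def should_destroy(self):
        """True once no connectors are left, i.e. the server can be torn down."""
        with self._lock:
            return not self.connectors
```

With the assumed defaults, `ManagedServer().capacity` is 6. A periodic job (for example a Celery task, as mentioned above) could call `reap_expired` and then check `should_destroy` to decide when the real instance should be terminated.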

 

Working Pieces Readily Available

  • setup.py that does all of the above (scalability still has to be tested) (link)

  • Loom video by Sagar (link)

  • Video by Waseem (link)

 

Action Items

Action | Description | Owner | Due date | Jira ticket

1 | Test server for capacity of connectors it can run | - | Jun 16, 2021 | -

2 | Experiment with Docker containers to run multiple containers | - | - | -

3 | Check Kubernetes can solve the problem of managing multiple connectors on a server/multiple servers | - | - | -

References and documentation

  • A Complete Primer on Terraform: link

  • A Complete Tutorial on Ansible: link