Multitenant applications: How and Why
@xima
Who am I?
 Filipe Ximenes
 Recife / Brazil
 Aussie for 1 year
(2008 - 2009)
vinta.com.br/playbook
FLOSS
Django React boilerplate
https://github.com/vintasoftware/django-react-boilerplate
Django Role Permissions
https://github.com/vintasoftware/django-role-permissions
Tapioca
https://github.com/vintasoftware/tapioca-wrapper
Context
Corporate
Fidget Spinner
Tracking
"How do you protect our data?"
What is Multitenancy
"... refers to a software
architecture in which a single
instance of software runs on a
server and serves multiple
tenants."
- Wikipedia
What do we want to achieve?
 Reduce infrastructure costs by sharing hardware resources
 Simplify software maintenance by keeping a single code base
 Simplify infrastructure maintenance by having fewer nodes
Single Shared Schema
[or how the big guys do it]
"Talk is cheap..."
Routing - ibm.spinnertracking.com
def tenant_middleware(get_response):
    def middleware(request):
        host = request.get_host().split(':')[0]
        subdomain = host.split('.')[0]
        try:
            customer = Customer.objects.get(name=subdomain)
        except Customer.DoesNotExist:
            customer = None
        request.customer = customer
        response = get_response(request)
        return response
    return middleware
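The subdomain lookup in the middleware above can be pulled out into a small pure function, which makes the edge cases easier to test. A minimal sketch, assuming a hypothetical helper name `subdomain_from_host` and a hypothetical `base_domain` parameter (neither comes from the talk). Unlike the slide's code, it returns `None` for the bare domain instead of treating `spinnertracking` as a tenant name:

```python
def subdomain_from_host(host, base_domain="spinnertracking.com"):
    """Return the tenant subdomain for `host`, or None if there is none.

    `base_domain` is an assumed setting; the talk only shows
    ibm.spinnertracking.com as an example host.
    """
    # Drop an optional port, e.g. "ibm.spinnertracking.com:8000".
    hostname = host.split(":")[0].lower()
    suffix = "." + base_domain
    if not hostname.endswith(suffix):
        return None  # bare domain or an unrelated host
    subdomain = hostname[: -len(suffix)]
    # Reject empty or nested subdomains such as "a.b.spinnertracking.com".
    if not subdomain or "." in subdomain:
        return None
    return subdomain
```

The middleware could then look up the customer only when this function returns a name, and fall back to `customer = None` otherwise.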
Querying
avg_duration = (
    Spin.objects
    .filter(user_spinner__user__customer=request.customer)
    .aggregate(avg=Avg('duration')))['avg']
Simpler querying
avg_duration = (
    Spin.objects
    .filter(customer=request.customer)
    .aggregate(avg=Avg('duration')))['avg']
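The difference between the two queries is whether the tenant is reached through a chain of joins or through a tenant column stored directly on each row. A rough illustration using the standard library's `sqlite3`, with table and column names invented for this sketch (the talk does not show the actual models):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE app_user (id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE TABLE user_spinner (id INTEGER PRIMARY KEY, user_id INTEGER);
    -- spin.customer_id is the denormalized tenant column.
    CREATE TABLE spin (
        id INTEGER PRIMARY KEY,
        user_spinner_id INTEGER,
        customer_id INTEGER,
        duration REAL
    );
    INSERT INTO customer VALUES (1, 'ibm'), (2, 'vinta');
    INSERT INTO app_user VALUES (1, 1), (2, 2);
    INSERT INTO user_spinner VALUES (1, 1), (2, 2);
    INSERT INTO spin VALUES (1, 1, 1, 100), (2, 1, 1, 200), (3, 2, 2, 50);
""")


def avg_duration_via_joins(customer_id):
    """Reach the tenant through spin -> user_spinner -> user."""
    row = conn.execute("""
        SELECT AVG(spin.duration)
        FROM spin
        JOIN user_spinner ON user_spinner.id = spin.user_spinner_id
        JOIN app_user ON app_user.id = user_spinner.user_id
        WHERE app_user.customer_id = ?
    """, (customer_id,)).fetchone()
    return row[0]


def avg_duration_direct(customer_id):
    """Filter on the tenant column stored on the spin row itself."""
    row = conn.execute(
        "SELECT AVG(duration) FROM spin WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    return row[0]
```

Both functions return the same average as long as `spin.customer_id` is kept consistent with the owning user, which is the price of the simpler query.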
Case study: Salesforce
 1:5000 ratio
 Double checking
 Transparent to developers
Drawbacks
 Guaranteeing isolation is hard
 Might add complexity to the codebase
 3rd party library integration
Multiple databases
Routing
DATABASES = {
    'default': {
        'ENGINE': ...,
        'NAME': ...,
    },
    'ibm': {
        'ENGINE': ...,
        'NAME': ...,
    }
}
The `.using()` approach
spinners = (
    Spinner.objects
    .using(request.customer.name)
    .annotate(
        avg_duration=Avg('owned_spinners__spins__duration'))
    .order_by('-avg_duration'))
The threadlocal middleware approach
def multidb_middleware(get_response):
    def middleware(request):
        subdomain = get_subdomain(request)
        customer = get_customer(subdomain)
        request.customer = customer
        @thread_local(using_db=customer.name)
        def execute_request(request):
            return get_response(request)
        response = execute_request(request)
        return response
    return middleware
The router
class TenantRouter(object):
    def db_for_read(self, model, **hints):
        return get_thread_local('using_db', 'default')
    def db_for_write(self, model, **hints):
        return get_thread_local('using_db', 'default')
# ...
# settings.py
DATABASE_ROUTERS = ['multitenancy.routers.TenantRouter']
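The slides use `thread_local` and `get_thread_local` without showing their definitions. One way such helpers could be written with the standard library's `threading.local` is sketched below; this is an assumption about their shape, not the code from the talk:

```python
import functools
import threading

_state = threading.local()
_MISSING = object()


def thread_local(**values):
    """Decorator: set thread-local values while the wrapped function runs."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Remember what was set before so it can be restored afterwards.
            previous = {k: getattr(_state, k, _MISSING) for k in values}
            for key, value in values.items():
                setattr(_state, key, value)
            try:
                return func(*args, **kwargs)
            finally:
                # Restore the earlier state so a tenant never leaks into
                # the next request handled by the same thread.
                for key, old in previous.items():
                    if old is _MISSING:
                        delattr(_state, key)
                    else:
                        setattr(_state, key, old)
        return wrapper
    return decorator


def get_thread_local(name, default=None):
    """Read a value set by `thread_local`, or `default` if it is unset."""
    return getattr(_state, name, default)
```

The `finally` block matters: worker threads are reused across requests, so the selected database has to be cleared even when the view raises.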
Querying
spinners = (
    Spinner.objects
    .using(request.customer.name)
    .annotate(
        avg_duration=Avg('owned_spinners__spins__duration'))
    .order_by('-avg_duration'))
Database Multitenancy
vs.
Application Multitenancy
Single Database
Multiple Schemas
What are schemas in the first place?
SELECT id, name FROM user
WHERE user.name LIKE 'F%';
What are schemas in the first place?
CREATE SCHEMA ibm;
SELECT id, name FROM ibm.user
WHERE ibm.user.name LIKE 'F%';
The `search_path`
SET search_path TO ibm;
SELECT id, name FROM user
WHERE user.name LIKE 'F%';
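PostgreSQL resolves an unqualified table name by walking the schemas in `search_path` in order and using the first one that contains the table. A toy model of that lookup, where the `catalog` dictionary and the `public.customer` table are assumptions made for illustration and not part of the talk:

```python
def resolve_table(table, search_path, schemas):
    """Return the schema-qualified name the first matching schema gives.

    `schemas` maps schema name -> set of table names; it stands in for
    the database catalog.
    """
    for schema in search_path:
        if table in schemas.get(schema, set()):
            return f"{schema}.{table}"
    raise LookupError(f'relation "{table}" does not exist')


# Hypothetical catalog: shared tables in public, tenant tables per schema.
catalog = {
    "public": {"customer"},
    "ibm": {"user", "spinner_spin"},
    "vinta": {"user", "spinner_spin"},
}
```

With `search_path` set to `ibm, public`, the name `user` resolves to `ibm.user`, while `customer` falls through to `public.customer`.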
Django-tenant-schemas
Routing - middleware
# ...
connection.set_schema_to_public()
hostname = self.hostname_from_request(request)
TenantModel = get_tenant_model()
try:
    tenant = self.get_tenant(TenantModel, hostname, request)
    assert isinstance(tenant, TenantModel)
except TenantModel.DoesNotExist:
    # ...
request.tenant = tenant
connection.set_tenant(request.tenant)
# ...
Routing - settings
MIDDLEWARE_CLASSES = [
    'tenant_schemas.middleware.TenantMiddleware',
    # ...
]
DATABASES = {
    'default': {
        'ENGINE': 'tenant_schemas.postgresql_backend',
        'NAME': 'mydb',
    }
}
Routing - db backend
# ...
try:
    cursor_for_search_path.execute(
        'SET search_path = {0}'.format(','.join(search_paths)))
except (django.db.utils.DatabaseError, psycopg2.InternalError):
    self.search_path_set = False
else:
    self.search_path_set = True
if name:
    cursor_for_search_path.close()
# ...
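Because the `SET search_path` statement is built with string formatting, the schema names going into it need to be trusted. A small sketch of building that statement with a conservative name check; the function name `search_path_sql`, the `include_public` flag, and the validation pattern are assumptions of this example, not the library's API:

```python
import re

# Conservative pattern for unquoted identifiers (at most 63 characters);
# an assumption of this sketch.
_SCHEMA_NAME = re.compile(r"[a-z_][a-z0-9_]{0,62}")


def search_path_sql(tenant_schema, include_public=True):
    """Build the SET search_path statement for one tenant schema."""
    if not _SCHEMA_NAME.fullmatch(tenant_schema):
        raise ValueError(f"invalid schema name: {tenant_schema!r}")
    search_paths = [tenant_schema]
    # Keep shared tables reachable after the tenant's own tables.
    if include_public and tenant_schema != "public":
        search_paths.append("public")
    return "SET search_path = {0}".format(",".join(search_paths))
```

For the `ibm` tenant this yields `SET search_path = ibm,public`, so unqualified table names hit the tenant's schema first.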
The Command Line
./manage.py tenant_command shell
./manage.py createsuperuser
./manage.py migrate_schemas
Querying
spinners = (
    Spinner.objects
    .annotate(
        avg_duration=Avg('owned_spinners__spins__duration'))
    .order_by('-avg_duration'))
Querying across schemas
SELECT id, duration FROM ibm.spinner_spin
WHERE duration > 120
UNION
SELECT id, duration FROM vinta.spinner_spin
WHERE duration > 120;
Querying across schemas
SELECT uuid, duration FROM ibm.spinner_spin
WHERE duration > 120
UNION
SELECT uuid, duration FROM vinta.spinner_spin
WHERE duration > 120;
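Writing the cross-schema `UNION` by hand does not scale with the number of tenants, so the statement is usually generated. A minimal sketch that reproduces the `uuid` query for a list of schemas; the helper name `union_across_schemas` is hypothetical. The slides switch from `id` to `uuid`, presumably because each schema has its own id sequence, so ids repeat across tenants:

```python
def union_across_schemas(schemas, min_duration=120):
    """Build one UNION query over the spinner_spin table of each schema."""
    if not schemas:
        raise ValueError("at least one schema is required")
    for schema in schemas:
        # Rough safety check only: schema names are interpolated into SQL.
        if not schema.isidentifier():
            raise ValueError(f"invalid schema name: {schema!r}")
    selects = [
        f"SELECT uuid, duration FROM {schema}.spinner_spin\n"
        f"WHERE duration > {int(min_duration)}"
        for schema in schemas
    ]
    return "\nUNION\n".join(selects) + ";"
```

Calling `union_across_schemas(["ibm", "vinta"])` produces the same statement shown on the slide.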
Upsides
 Querying looks the same as in a standard application
 New schemas are created automatically
 Knows how to handle migrations
 Simpler infrastructure
Drawbacks
 Be careful with too many schemas (maybe no more than a few hundred clients?)
 Tests need some setup and might get slower
 Harder to query across schemas
Multitenancy is not discrete; it is a continuous spectrum
bit.ly/django-multitenancy
github.com/filipeximenes/multitenancy
Obrigado!
http://bit.ly/vinta2017
Newsletter:
vinta.com.br/blog/
twitter.com/@xima
github.com/filipeximenes
ximenes@vinta.com.br
