Added OED validation on file upload, and updated ods-tools package to 3.0.1 #724
# Load DataFrame and pass to ods-tools exposure class
exposure = OedExposure(**{
    EXPOSURE_ARGS[field]: pd.read_csv(io.BytesIO(f.file.read()))
I kind of wish we could just put `EXPOSURE_ARGS[field]: f.file`.
What type is `f.file`?
I may be able to add it to the formats `ods_tools` supports.
Having an option to read from a binary stream (like `io.BytesIO`) would be handy; I think that task should belong with ods-tools.
Currently the platform loads the bytes into a pandas DataFrame before passing it over to the Exposure class, which is basically a workaround.
The `f.file` object is from Django, and it's a wrapper around the file storage class loaded in Django's settings.
For example, it could be a file from disk, an S3 object, an Azure Blob, etc.
> f
<RelatedFile: File_d9542031640d42e2b7f27560be2290d5.csv>
> f.file
<FieldFile: d9542031640d42e2b7f27560be2290d5.csv>
> type(f.file)
<class 'django.db.models.fields.files.FieldFile'>
- `RelatedFile` is from here https://github.com/OasisLMF/OasisPlatform/blob/develop/src/server/oasisapi/files/models.py#L57-L69
- `FieldFile` is from Django Storage
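The buffering step in the workaround can be sketched without Django or pandas. This is a minimal illustration under stated assumptions, not the platform's code: `FakeStoredFile` is a hypothetical stand-in for a storage-backed file object that only exposes `read()`, and the standard-library `csv` module stands in for `pd.read_csv` so the snippet runs without third-party packages.

```python
import csv
import io


class FakeStoredFile:
    """Hypothetical stand-in for a storage-backed file (disk, S3, Azure...)."""

    def __init__(self, payload: bytes):
        self._payload = payload

    def read(self) -> bytes:
        # A real storage backend would fetch the bytes from wherever they live.
        return self._payload


def load_rows(stored_file):
    """Buffer the file's bytes in memory, then parse them as CSV."""
    buffer = io.BytesIO(stored_file.read())  # same buffering step as the PR
    text = io.TextIOWrapper(buffer, encoding="utf-8", newline="")
    return list(csv.DictReader(text))


stored = FakeStoredFile(b"PortNumber,LocNumber\n1,10\n1,11\n")
rows = load_rows(stored)
print(rows)
```

Whatever the storage backend is, the parser only ever sees an in-memory binary stream, which is why the approach works for disk, S3 and Azure alike, at the cost of holding the whole file in memory.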
I agree it'd be nice if `ods_tools` supported this natively.
For this PR on Platform, I approve the current implementation with the `io.BytesIO` workaround; we'll update if/when we update the `ods_tools` API.
Looking at `OedExposure`, I see there is an `OedSource.from_file_path()` method that is triggered if the file is a path, but `from_file_path` only returns an initialised `OedSource` without actually reading the data from file.
Since `OedSource.from_dataframe()` loads the data, I think we should just have `OedSource.from_file_path()` load the data before returning.
@sstruzik can we do that or perhaps open an issue on the ods repo to track this?
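The suggestion to load eagerly in the path-based constructor can be illustrated with a toy class. Everything here is hypothetical: `SketchSource`, `from_rows` and the CSV columns are invented for the sketch and are not the `ods_tools` `OedSource` API; the only point is that both constructors return an object whose data is already loaded.

```python
import csv
import os
import tempfile


class SketchSource:
    """Hypothetical source class; not the ods_tools OedSource API."""

    def __init__(self, rows=None, path=None):
        self.rows = rows
        self.path = path

    @classmethod
    def from_rows(cls, rows):
        # Analogue of building from data that is already in memory.
        return cls(rows=list(rows))

    @classmethod
    def from_file_path(cls, path):
        # Read the file before returning, so callers never receive an
        # initialised-but-empty object.
        with open(path, newline="", encoding="utf-8") as handle:
            rows = list(csv.DictReader(handle))
        return cls(rows=rows, path=path)


with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "location.csv")
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        handle.write("LocNumber,BuildingTIV\n1,1000\n")
    source = SketchSource.from_file_path(csv_path)

# The rows remain available after the temporary file is deleted,
# because they were read eagerly.
print(source.rows)
```

The trade-off of eager loading is that construction does the I/O up front, so a missing or unreadable file fails immediately rather than at first use.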
Nice new feature.
Minor fixes requested before approval.
You may want to update the PR title and wherever it makes sense to reflect that you're upgrading directly to ods_tools 3.0.1, just to avoid confusion when looking this up in the future.
Thanks!
* Add code-quality.yml
* Always run both code scans on fail
* Run Autopep8
* missed some autopep
* Remove unused imports
* Fix Autopep8 from #724
… 3.0.1 (#724)
* test file return format conv This reverts commit 8c0de7e.
* Draft option to validate oed files on upload
* validate on upload - wip
* Note for later
* Draft serializer to return portolio validation status
* WIP GET validated files
* POST portfolio validate
* Add validation url param
* read validation option, url/settings.py
* Set ods-tools 3.0.1
* fix unittests
* Fix validation on parquet file uploads
* Fix handling of invalid data upload
* Fix typos and missing docs strings
* remove dup func
* Turn valadation on by default
* Add validation unit testing
* Add test_all_exposure__are_valid
* set set_portolio_valid as portfolio method instead of function
* Move VALIDATION_CONFIG to settings.py
* Update Swagger to show validate on upload default=true
* fix loading location data in worker * replace package updates * Update server - python requ and ods-tools>3.0.1 * Added OED validation on file upload, and updated ods-tools package to 3.0.1 (#724) * test file return format conv This reverts commit 8c0de7e. * Draft option to validate oed files on upload * validate on upload - wip * Note for later * Draft serializer to return portolio validation status * WIP GET validated files * POST portfolio validate * Add validation url param * read validation option, url/settings.py * Set ods-tools 3.0.1 * fix unittests * Fix validation on parquet file uploads * Fix handling of invalid data upload * Fix typos and missing docs strings * remove dup func * Turn valadation on by default * Add validation unit testing * Add test_all_exposure__are_valid * set set_portolio_valid as portfolio method instead of function * Move VALIDATION_CONFIG to settings.py * Update Swagger to show validate on upload default=true * Add migrations + fix swagger schema * Fix pep8 and flake issues * Fix issues with merge * flake8 * Fix location file loading for oasislmf==1.27 * Fix
* Fix/CVE issues plat 2 (#648) * Test trivy * Updt * Updt * python CVE * Updt * Updt * User mapped docker * Update requ and pip oasislmf to latest LTS * Update unittests to python3.10 + add user mapping * Fix deps * Update images to 22.04 * Update unittest base image * Clean up older docker files * Docker user map for schema tester * Fix unittests * Fix unittests * Fix unittests * fix unittests * Run containers as non root * Update pandas * Fix build error installed cffi * Non-root containers -- wip not building yet * Lock down server image, no package installs with "server" user * Fix pyarrow install error * Fix worker build + lockdown * Fix * lockdown worker-controller * Update docker-compose * Fix $PATH * Fix celery error * Fix file access error * fix * Fix perms * Fix Azure server crash * Fix Azure server crash * Fix root access for ONLY updating the hosts file -- azure crash issue * Set store file w/o setting permistions * Fix writes to root mounted fs by non-root accounts * fix * Update .dockerignore * Fix twisted * Fix wasted space * Fix * Fix unittest docker runner * Fix arti storage * test fix * Try again * Feature/main branch ports (#651) * Feature/1018 default samples (#620) * Add pkg-config for ktools build * Load model_settings on run analysis - for default samples * Update analysis schema * Update model schema * Feature/608 settings templates (#621) * First pass Swapped layout for nested resources Fix serializers * Fix routers and added description field * Update requirments * Rename filename to name * Update deploy script * Update tester * Fix/list views order (#653) * Option 1 - force order in VerifyGroupAccessModelViewSet * Option 2 - Force the order via the model Meta field * try 2 * Revert "Option 1 - force order in VerifyGroupAccessModelViewSet" This reverts commit 883ba3c. 
* remove comments * Feature/filter options subtasks (#655) * Add filterset for subtasks, and param fiters on analysis "sub_task_list" * Fix template view * Fix template view * Fixed? * Fix slug param and tidy * Fix * Fix/631 autoscale non zero (#656) * Fix boolean arg loading * Remove debug prints * Fixes for work sub-tasks (#657) * Draft task context logger * Update compose for debug * Draft task log storage 2 * Clean up task log storage * Create dir in workers for task logging "/var/log/oasis/tasks" * Create timestamps before response queued * Fix model_settings_json to abs path * Add verbose logging * Fix/keycloak auth proxy (#659) * test X-Forwarded-For * Fix parse error * clean up * Set proxy idle timeout to 2h * Set keepalive options * Final? * Skip secret check in image scans * Update gen scan * Fix/loclines lessthan chunks (#660) * Add check for loc lines * Fix error msg * Fix unittests * Fix/646 websocket error handling (#662) * Test workaround from GH issues * Test worker-controller more updates * Pin Oasislmf for output fix & fix logging * Run update requirments as root * Feature/pre analysis hook (#663) * Basic pre-analysis hook * tidy * Fix loading * Fix portfolio file removal * Clean up tmp pre-analysis-hook exposure files on failed run * Update oasislmf package pin * Fix/output kat chunking (#665) * Fix output kats and append logs to sub-task errors * Add max chunks option * Migrations * Tidy up * Pin new oasislmf fix * Add max chunks to controller * rm debug trace * Fix for gulpy * own conf.ini * own var for jba * Fix large file download/upload timeout (#685) * Scrap all changes and lay groundwork for two server containers Switch socket port to 8001 Add uwsgi to requ switch default command to wsgi Revert "Add uwsgi to requ" This reverts commit ae7f108. Install uwsgi as binary package Revert "Revert "Add uwsgi to requ"" This reverts commit 2f82ddf. Add wsgi Revert "Revert "Revert "Add uwsgi to requ""" This reverts commit c43f2b3. 
try gunicorn fix routing Increase timeout and match worker count to system cores Increased timeout fixed download test charts Fix API azure access Fixed DNS routing to websocket container Fix token access issue? Fix sub_path enable/disable Disable "/api/ sub-path for internal web-socket server" websocket server has no connection to keycloak for auth, try using the current oasisServer options Fix? fix ugh shot in the dark Remove added client use keycloak settings from server values * Add option for ssl redis (#687) * Add option for ssl redis * Update configMaps * syntax fix * Add missing ssl data in configMap * FIx ssl data in ConfigMap * When running with REST api in Subpath, also work with root pathing * Fix/auth timeout (#697) * Test updated settings * bump UI ver * Revert "Test updated settings" This reverts commit e5b6b3c. * Increase timeout to 1d * Hot fix for exposure file content type "application/vnd.ms-excel" (#700) * Update/CVE versions (#708) * Update packages with CVE issues Update worker-controller fix joblib and oauthlib fix django django-celery-results pandas ruamel.yaml parso and distlib SQLalc Updated package virtualenv Updated package filelock Updated package azure-storage-blob Updated package coverage Updated package django-request-logging Updated package drf-yasg Updated package numpy Updated package scipy Updated package waitress Updated package ods-tools Updated package psycopg2-binary Updated package scikit-learn Add update requ by package script Updated package oasislmf Updated package celery==5.* Updated package numpy * Fix worker-controller crash with py3.10 * Update model settings schema * Update to RC of oasislmf * Revert "Update model settings schema" This reverts commit b51e1a2. * Revert "Update to RC of oasislmf" This reverts commit 3eb9dad. 
* Base case, update keycloak to most recent v15 - CVE (102) 32 High, 94 fixable * Always include ssl in configmap * Always include ssl in configmap (#719) * Add support for registry credentials * Remove redundant range * Collect extra lookup files (#731) * Keys file merge Fix parquet Fix file merging * Update file merge * Fixes for Github actions - Platform 2.x (#717) * Github actions CI/CD update (#711) * Add working notes on docker image caching * Add placeholder files * test sarif report * Fix fmt * needs repo * f * scan all requ files * report update * fix report rename * complete repo / python scanning * complete repo / python scanning * Fix URL append * Fix URL append take2 * unit test and report with Junit * apt install sudo * Add report file * Fix report path * try another formatter * Switch to python setup * fix * where is report * f * just tox * switch to py3.10 * switch to py3.10 * Test build matrix * Fix output? * Create new file for piwind build * remove piwind * Ignore scan errors on debian image * Fix image ref * f * Add schema builder * typo fix * Fix missing dir * fix * time image size * time image size * fix * fix again * Add version workflow * Move and update helper scripts * missed params in img check * Test images for wasted space * Fix bash? * fix * Add version fixes * Draft script to clean-up workflow runs * tmp * outline for release workflow * Test workflow connections * Add missing triggers * Fix booleans * test workaround * test workaround * try 1/0 for bool * How are booleans this broken in GH actions * Update schema test script loc * fix missing "/" * Fix mount location * print needs context * no easy way to output from matrix of jobs, missing ref, create image names externally * try from env * no dice, env not in context * ffs .. * Fix schema scan * where is the build output image names? * Try unconnected output image names * ????!? * Give up.. hardcode output vars * Give up.. 
hardcode output vars * fix docker push * disable push and force retest * Force retest * try build and push for plat * force retest * re-test * Sorry for all the spam Matt ~ force retest * Add release checks and image re-tags * Fix align * Fix syntax * Test Release script * fix * re-test * re-test * fix tags * fix tags * fix branchs * fix prev-release find steps * fix? * Test find release script * Try running bash from script * weird pathing, where is the script? * Fix dir overwrite * try again * Fix for ktools tags which start with "v" * Add docker push * Fix script arg * forece retest * forece retest * retest * again * Fix schmea and version checks * Fix test branch + content type * why no match? * fix ver checks * Fix oasislmf version check * Add overrides to auto-fetch the correct current/prev versions * fix ifs * fix * Fix create Release notes * Move changelog script to OasisPlatform * Add missing requ file * Fix changelog script call * Plz just publish * Fix bad path var * Fix release path * Tidy & force retest * Add image summary to build * Fix summary * Ready publish script for merge * dont push "latest" tags on backport branches * Draft image testing * Test piwind matrix testing * fix * image build - fix checkout on PR * fix * Add option to skip scan image on build * Test s3 piwind run from Platform * Fix? * test - set correct piwind branch * force retest * Retest with wait for portfolio * If scan skipped also skip image size testing * try only s3 * test * Try tests without matrix * Update stored results/logs + add extra tests * Update test-piwind.yml * Try smaller test * Skip S3 check * Skip S3 check * Test with updated wait fixture * Fix worker check * Is the hang from debug mode? 
* update * Run in parallel * Tidy up * Update name * Add piwind branch select * fix script branch * Add missing branch output * remove placeholder files * Update badges * tidy names * Add PR / push triggers * Add manual trigger to image tests * Fix env context error * fix syntax * Update trigger branches and enable CVE error on hit * Fix scan inputs * Update github workflows with versions from c8542c9 * Run Autopep8 * Fix unused imports - Flake8 * Collect extra lookup files (#731) * Keys file merge Fix parquet Fix file merging * Update file merge * Fix code-ql post merge * Switch branch targets * Fix unit testing * Fix find prev release * Test keycloak and worker-controller scan * Fix step name * fix for trivy issue - aquasecurity/trivy#3514 * Revert "fix for trivy issue - aquasecurity/trivy#3514" This reverts commit 21afc42. * Wait for trivy bug fix * Fix test_analysis_model.py * Fix missing import + pep * Add worker controller to build * Add publish for worker controller * Test plat2 compose in piwind * Fix test images * Try keycloak scan again * Add workflow for scanning external images * fix * Fix external img scanning * Fix worker controller build * Fix missing mapping * Switch to fail-fast false * More fixes * Add retry for failed model reg tasks * Revert "Add retry for failed model reg tasks" This reverts commit ebecb93. * Retry worker reg, -- dont dup config * Add workaround to align platform-2.0 branches with piwind * Fix merge errors * Fix missing piwind branch output * OED fixes for platform2 (#744) * fix loading location data in worker * replace package updates * Update server - python requ and ods-tools>3.0.1 * Added OED validation on file upload, and updated ods-tools package to 3.0.1 (#724) * test file return format conv This reverts commit 8c0de7e. 
* Draft option to validate oed files on upload * validate on upload - wip * Note for later * Draft serializer to return portolio validation status * WIP GET validated files * POST portfolio validate * Add validation url param * read validation option, url/settings.py * Set ods-tools 3.0.1 * fix unittests * Fix validation on parquet file uploads * Fix handling of invalid data upload * Fix typos and missing docs strings * remove dup func * Turn valadation on by default * Add validation unit testing * Add test_all_exposure__are_valid * set set_portolio_valid as portfolio method instead of function * Move VALIDATION_CONFIG to settings.py * Update Swagger to show validate on upload default=true * Add migrations + fix swagger schema * Fix pep8 and flake issues * Fix issues with merge * flake8 * Fix location file loading for oasislmf==1.27 * Fix * CVE scans are raising issues but Actions marked as passed (#746) * Is it fail fast? * Revert "Is it fail fast?" This reverts commit b7fceae. * is output report masking error? 
* Run Trivy twice, once for error status, then after to log all issues * Log all unfixed issues * Fix * Same for platform scanning * Test cache dir * cache dosnt speed up 2nd run * Only fail crit or high * Keep a copy of requirments in final image (needed for scanning) * No need to run seveal repo scans, issues should be picked up at image build * Update/oasislmf 1.27 (#745) * tidy up workflows * Set worker to use oasislmf 1.27 * Set debug=1 * Fix eltcalc output * Run all piwind tests * Fix loading default - PORTFOLIO_UPLOAD_VALIDATION * Fix test_portfolio.py * Update kube-prometheus-stack (#748) * Try switching kube-prometheus-stack * Try 44.3.1 * Update helm dependency build + fix values * Add draft workflow of helm * update * Update notes * Set ods-tools 3.0.2 (#753) * Set ods-tools 3.0.2 * Dont use ods-tools for format convention If OED exposure is passed though ods-tools extra col will be added which returns different data based on the base format Example: POST: CSV file -> portfolios/{n}/location_file GET: portfolios/{n}/location_file?format=csv (no blank required col inserted) GET: portfolios/{n}/location_file?format=parquet (any missing TIV cols will be added before data is return) Fix: Only add extra columns if file is validated * Remove unsed import * RiskLevel allowed blank, remove error from test * fix unused import * Update/keycloak (#709) * test latest * Set keycloak to 19.0.3-legacy * Add missing placeholder * Try 20.0.3 * Fix keycloak config for v20 test Fix auth path and admin user clear old files clean templates/keycloak.yaml * Fix openid - only works for "./manage runserver" tmp Try fresh export Try adding scope into requests Revert "Try adding scope into requests" This reverts commit 0c4abb54530e41095e4c15eaabcc42ffcab550e2. Revert "Revert "Try adding scope into requests"" This reverts commit 0ca52c7a1eea1aef37e5ed3e7ce3726e402858ba. 
* Try fix for remapping keycloak host external -> internal * Fix dockerfile * pep8 fix * Delete backup oasis-realm * Update external image versions (#757) * Show misconfig but dont fail CI * scan every 6h * Fix cryptography * Only create image CVE if exitcode is 1 * Try latest external images * test hack to set PV permisstions * Update name * Delete, recreate oasis-server client, and update realm export * Limit external image scanning to vuln * Fix flanky token request Token requests work with a fresh access token but fail on refresh due to internal -> external mapping missing from 'src/server/oasisapi/auth/serializers.py' ``` File "/home/server/.local/lib/python3.10/site-packages/django/core/handlers/exception.py", line 47, in inner response = get_response(request) File "/home/server/.local/lib/python3.10/site-packages/django/core/handlers/base.py", line 181, in _get_response response = wrapped_callback(request, *callback_args, **callback_kwargs) File "/home/server/.local/lib/python3.10/site-packages/django/views/decorators/csrf.py", line 54, in wrapped_view return view_func(*args, **kwargs) File "/home/server/.local/lib/python3.10/site-packages/django/views/generic/base.py", line 70, in view return self.dispatch(request, *args, **kwargs) File "/home/server/.local/lib/python3.10/site-packages/rest_framework/views.py", line 509, in dispatch response = self.handle_exception(exc) File "/home/server/.local/lib/python3.10/site-packages/rest_framework/views.py", line 469, in handle_exception self.raise_uncaught_exception(exc) File "/home/server/.local/lib/python3.10/site-packages/rest_framework/views.py", line 480, in raise_uncaught_exception raise exc File "/home/server/.local/lib/python3.10/site-packages/rest_framework/views.py", line 506, in dispatch response = handler(request, *args, **kwargs) File "/var/www/oasis/src/server/oasisapi/auth/views.py", line 44, in post return super().post(request, *args, **kwargs) File 
"/home/server/.local/lib/python3.10/site-packages/rest_framework_simplejwt/views.py", line 43, in post serializer.is_valid(raise_exception=True) File "/home/server/.local/lib/python3.10/site-packages/rest_framework/serializers.py", line 227, in is_valid self._validated_data = self.run_validation(self.initial_data) File "/home/server/.local/lib/python3.10/site-packages/rest_framework/serializers.py", line 429, in run_validation value = self.validate(value) File "/var/www/oasis/src/server/oasisapi/auth/serializers.py", line 73, in validate response = requests.post( File "/home/server/.local/lib/python3.10/site-packages/requests/api.py", line 115, in post return request("post", url, data=data, json=json, **kwargs) File "/home/server/.local/lib/python3.10/site-packages/requests/api.py", line 59, in request return session.request(method=method, url=url, **kwargs) File "/home/server/.local/lib/python3.10/site-packages/requests/sessions.py", line 587, in request resp = self.send(prep, **send_kwargs) File "/home/server/.local/lib/python3.10/site-packages/requests/sessions.py", line 701, in send r = adapter.send(request, **kwargs) File "/home/server/.local/lib/python3.10/site-packages/requests/adapters.py", line 565, in send raise ConnectionError(e, request=request) Exception Type: ConnectionError at /access_token/ Exception Value: HTTPSConnectionPool(host='ui.oasis.local', port=443): Max retries exceeded with url: /auth/realms/oasis/protocol/openid-connect/token (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fa9bb239db0>: Failed to establish a new connection: [Errno -2] Name or service not known')) ``` * Fix missing severity * Skip mysql, since postgres is default * PEP * Update python packages * Fix * Fixes for merging of distributed input files (#758) * Fixes for merging of distributed input files * Simplify condition * Lint * Fix settings file file loading for custom lookup (#670) * remove "lookup_complex_config_json" param if no 
settings file given * ++ version num * Fix pep * Align with branch backports/1.27.x (#762) * Fix schema build workflow (#764) * Fix schmea build * Fix missing dir * Set version 2.1.0 * Dont use latest tag for v2 images, avoid problems with current platform * Update changelog --------- Co-authored-by: Carl Fischer <carl.fischer@jbarisk.com> Co-authored-by: awsbuild <awsbuild@oasislmf.org>
Fixed Parquet storage feature with ods-tools==3.0.0
Integrated OED validation to portfolios
PORTFOLIO_UPLOAD_VALIDATION=<true/false>
This sets a default option which runs OED validation on files uploaded to a portfolio. If a file fails validation, it will return an HTTP 400 - Bad Request response containing the validation errors.
/v1/portfolios/{id}/location_file/
/v1/portfolios/{id}/accounts_file/
/v1/portfolios/{id}/reinsurance_info_file/
/v1/portfolios/{id}/reinsurance_scope_file/
?validate=<true/false>
Example JSON return:
`csv` and `parquet`
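How the `PORTFOLIO_UPLOAD_VALIDATION` default and the `?validate=<true/false>` parameter might combine can be sketched as follows. This is an assumption-laden illustration rather than the platform's implementation: the helper names `upload_url` and `should_validate`, the `http://localhost:8000` host, and the rule that an explicit query value overrides the default are all hypothetical.

```python
from urllib.parse import urlencode

# File endpoints listed in the PR description.
FILE_ENDPOINTS = (
    "location_file",
    "accounts_file",
    "reinsurance_info_file",
    "reinsurance_scope_file",
)


def upload_url(base_url, portfolio_id, endpoint, validate=None):
    """Build an upload URL, adding ?validate=... only when requested."""
    if endpoint not in FILE_ENDPOINTS:
        raise ValueError(f"unknown file endpoint: {endpoint}")
    url = f"{base_url.rstrip('/')}/v1/portfolios/{portfolio_id}/{endpoint}/"
    if validate is not None:
        url += "?" + urlencode({"validate": str(validate).lower()})
    return url


def should_validate(query_params, default):
    """Assumed rule: an explicit ?validate value wins over the default."""
    raw = query_params.get("validate")
    if raw is None:
        return default
    return raw.strip().lower() == "true"


print(upload_url("http://localhost:8000", 1, "location_file", validate=False))
print(should_validate({"validate": "false"}, default=True))
print(should_validate({}, default=True))
```

With validation on by default, a client that wants to upload a known-incomplete file would pass `validate=false` explicitly; leaving the parameter off falls back to the server-side setting.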