Docsity
Docsity

Prepare for your exams
Prepare for your exams

Study with the several resources on Docsity


Earn points to download
Earn points to download

Earn points by helping other students or get them with a premium plan


Guidelines and tips
Guidelines and tips

SAS best Practices in Dataware house, Summaries of Computer Science

SAS best Practices in Dataware house and how to migrate to cloud

Typology: Summaries

2022/2023

Uploaded on 01/01/2023

raviaysola
raviaysola 🇺🇸

1 document

1 / 36

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
AMS Tech Summit.
SAS Teradata Integration,
Benefits & Cloud Overview
Bob Matsey & Paul Segal Senior Analytic Consultant
August 2022
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13
pf14
pf15
pf16
pf17
pf18
pf19
pf1a
pf1b
pf1c
pf1d
pf1e
pf1f
pf20
pf21
pf22
pf23
pf24

Partial preview of the text

Download SAS best Practices in Dataware house and more Summaries Computer Science in PDF only on Docsity!

AMS Tech Summit.

SAS Teradata Integration,

Benefits & Cloud Overview

Bob Matsey & Paul Segal – Senior Analytic Consultant

August 2022

2 ©2022 Teradata

  • Did You Know about SAS Usage
  • SAS Partnership Offerings
  • SAS & Teradata Partnership
  • SAS In Database Capabilities
    • Benefits of In Database
    • Coding Example
  • Customer Improvement Examples
  • VIYA Integration with Teradata
  • SAS & Vantage in the Cloud – Best Practices
  • Agile Analytics with Data Lab
  • Questions Agenda

4 ©2022 Teradata

The SAS & Teradata Partnership Overview Teradata is an Authorized Global Reseller of SAS Solutions Partnership began in 2007 to improving analytic performance Focus on joint product collaboration and customer success More than 550+ sales to over 600+ customers already *** Teradata has dedicated R&D teams onsite at SAS *** Regular collaboration on Joint Product Roadmap to ensure seamless product integration

5 ©2022 Teradata

Did You Know what the SAS Team is Working On? ▪ Work at Discover – SAS is using 33% of the CPU, and 21% of I/O ▪ Old process for Scoring was 7 days, New In DB with TD & SAS Scoring Accelerator was 36 minutes ▪ Work at Delta – SAS usage is 30% of the CPU, and 59% of all Queries ▪ Work at Amex - SAS is using 38% of the CPU, and 48.85 % of all Queries ▪ Work at Bell Canada – SAS is using 66% of the CPU, and 77% of all the Queries ▪ Implementing all SAS Viya RISK Applications to run on Teradata Vantage in the Cloud ▪ SAS Credit Scoring ▪ SAS Risk Modeling ▪ SAS Model Risk Management ▪ SAS Risk Engine ▪ Created a new Cloud Migration Program with Accenture & SAS – Rolling out now! ▪ Implementing for all Verticals ▪ More about: Integrating all SAS Risk Applications to run on Viya with Teradata in the Cloud ▪ No Other database vendor has this capability ( No Snowflake, No Redshift, No Synapse, integration!) ▪ A current example of Joint R&D development engagement as a continuous Partnership with SAS for last 15 years! ▪ SAS RISK PRODUCTS COMING IN Q4/Q! ▪ SAS Solution for CECIL ▪ SAS Solution for CCAR ▪ More to come – Financial Crimes Applications next ( Fraud, AML, etc.)

7 ©2022 Teradata

AMEX

8 ©2022 Teradata

10 ©2022 Teradata

4 Ways the SAS & Teradata Integration Drives Customer Value Translating analytic speed, availability and performance into increased corporate revenue Time to Develop Model Value Creation Additional Value Created Current Process Improved Process Reducing Model Development Time

Faster analytic systems allow users to build, test and implement new

models more quickly, creating additional value for the organization

Improved Model Performance

Being able to do more model testing and to update existing models to

achieve optimal performance can add significant value over time

Additional Value Created Per Year Month 2 Month 4 Month 6 Month 8 Month 10 Month 12 $500k $400k $300k $200K $100k $ Improved Models Existing Models Improving Workforce Productivity

Increasing the productivity of your existing data scientists reduces the

need for additional headcount as well as improves model performance

Scoring Models to All the Data

Increase the value achieved from each model by eliminating

unnecessary limitations to its performance due to the sampling data

Database

Value Produced

$1 Million

Value Produced

$4 Million

11 ©2022 Teradata

SAS & Teradata Partnership Offerings How we can help get started! SAS - Teradata Assessment Workshop (FREE!) SAS Teradata Coding Training Integration Skills Training (IST) Optimization Service Offering Data Lab Presales & Implementation Services SAS - Teradata Roadshow (One on one sessions with customers) Cloud Migration Program with Accenture & SAS Offering

13 ©2022 Teradata

Example of In Database with Proc FREQ

  • Request all rows
  • Select state, credit from credit data;
  • Calculate frequency count Traditional Technique SQL Pushdown
  • Select count(*), state, credit from…

group by state, credit;

  • Return only count

SQL

Select

Traditional SQL Pushdown Rows Returned (^) 9,000,000 51 Time to Process (^) 55 seconds 2 seconds

SQL

Select

SAS® Session

Proc Freq;

table state*credit;

SAS/Access to Teradata Node Amps Node Amps Node Amps Node Amps Node Amps Node Amps Node Amps Node Amps Node Amps Node Amps Node Amps Node Amps Node Amps Node Amps Node Amps Node Amps

14 ©2022 Teradata

In Database Coding Example Testing In-database Functionality Not Running In Database Example: ( SQLGENERATION=NONE;) will tell the code to NOT run In database.

Example 1 – Shows running a simple Proc Freq in a SAS program against a larger dataset ( at least 1- 2 million rows) without in-database

capabilities turned on & with SAS log turned on. Then review the SAS log for duration and database performance

Code Example :

12 libname tdXXXX teradata server="XXXserver" database=XXXXP user=&user password=&password; 13 17 options sastrace=(,,,ds) sastraceloc=saslog nostsuffix; 20 OPTIONS SQLGENERATION=NONE; 21 PROC FREQ DATA=tdxxxx.xxxxx; 22 TABLES XXXX_XXXX; 23 RUN; Running In Database Example: ( SQLGENRATION=DBMS; ) will tell the code to run In database

Example 2 – Running the same Proc Freq code in a SAS program with the following options: options SQLGENERATION=DBMS.

This option says to run the code In database whenever it can, so I highly recommend putting this on ALL your SAS code.

Code Example :

12 libname tdXXXX teradata server="XXXserver" database=XXXXP user=&user password=&password; 13 17 options sastrace=(,,,ds) sastraceloc=saslog nostsuffix; 20 OPTIONS SQLGENERATION=DBMS DBIDIRECTEXEC; 21 PROC FREQ DATA=tdxxxx.xxxxx; 22 TABLES XXXX_XXXX; 23 RUN;

Running these two test will show,

Example 1 – this will NOT run IN database.

Example 2 – will run IN database.

16 ©2022 Teradata

In-Database Functionality

  • PROC APPEND
  • PROC CONTENTS
  • PROC COPY
  • PROC DATASETS
  • PROC DELETE
  • PROC FORMAT
  • PROC FREQ
  • PROC MEANS
  • PROC PRINT
  • PROC RANK
  • PROC REPORT
  • PROC SORT
  • PROC SQL
  • PROC SUMMARY
  • PROC TABULATE Statistical Analysis Procedures: SAS Enterprise Miner: Base Procedures :
  • PROC CANCORR
  • PROC CORR
  • PROC FACTOR
  • PROC PRINCOMP
  • PROC REG
  • PROC SCORE
  • PROC TIMESERIES
  • PROC VARCLUS
  • PROC DMDB
  • PROC DMINE
  • PROC DMREG (Logistic Regression)
  • Also nodes for Input, Sample, Partition, Filter, Merge, Expand SAS Analytics for Teradata
  • PROC ACECLUS
  • PROC CALIS
  • PROC CANDISC
  • PROC DISCRIM
  • PROC FACTOR
  • PROC PRINCOMP
  • PROC TCALIS
  • PROC VARCLUS
  • PROC ORTHOREG
  • PROC QUANTREG
  • PROC REG
  • PROC ROBUSTREG
  • Match code
  • Parsing/Casing
  • Gender/Pattern/Identification analysis
  • Standardization SAS/Access to Teradata - PROC DS SAS Code Accelerator for Teradata SAS Scoring Accelerator for Teradata - EM/STAT* Models DQ Accelerator for Teradata PROC SCORE works with coefficients from:

17 ©2022 Teradata

Customer Case Study Global Telecom Firm Initial findings:

  • Widespread data marts
  • Multiple databases
  • Unnecessary ETL jobs
  • Major support and governance issues
  • Complexity on the rise
  • Shadow IT
  • High cost ODS ODS Smart Mart EDW mart Finance Marketing Product/ Services Executive mart mart mart mart OPSYS1 OPSYS2^ ERP^ OPSYS3^ OPSYS

19 ©2022 Teradata

# Process Name SAS + Oracle SAS + 2 Node Teradata X Faster 1 Horizontalization – Reduction Process 18 hrs 7 mins 32 mins 34 X 2 Horizontalization – 2nd 15 hrs 3 mins 33 mins 27 X 3 Variable Calculation 6 hrs 57 mins 4 mins 104 X 4 Risk Scoring^ 10 hrs 56 mins 11 mins (^) 60 X 5 Data Mart Generation^ 27 hrs 50 mins 1 hour 28 mins 19 X

20 ©2022 Teradata

SAS Viya & Vantage Integration