Slideshow transcript
Slide 1: Pro PostgreSQL Robert Treat omniti.com brighterlamp.org
Slide 2: O-0 Who Am I? (Why Listen To Me) PostgreSQL User Since 6.5.x DBA of High Traffic / Large PostgreSQL Instances Long Time Contributor to PostgreSQL Project Contribute / Maintain Several Open Source Projects Co-Author Beginning PHP & PostgreSQL 8 (Apress)
Slide 3: O-1 Outline What you need to know Availability about the project Scalability Getting started Query tuning Upgrading Tablespaces Configuring your server Partitioning Hardware Stuff you should know about
Slide 4: K-0 Know Your Way Around The Project
Slide 5: K-1 Know Your Way Around The Project www.postgresql.org downloads security alerts documentation wiki bug reports support companies rss –> news – events versions
Slide 6: K-2 Know Your Way Around The Project www.pgfoundry.org projects.postgresql.org URI Type CIText Modules SkyTools Programs Npgsql Pl/Proxy Resources pg_bulkload plpgsqldebugger sample databases
Slide 7: K-3 Know Your Way Around The Project www.planetpostgresql.org Project News Community News Helpful Tips / Examples
Slide 8: K-4 Know Your Way Around The Project archives.postgresql.org mailing list archives back to 1997 full text search via postgtresql 8.3 keyword search suggestions lists for users, developers, regional, user groups
Slide 9: K-5 Know Your Way Around The Project #postgresql irc.freenode.net real time help rtfm_please ??help
Slide 10: K-6 Know Your Way Around The Project project management core team committers hackers roadmap web team
Slide 11: S-0 Get Off To A Good Start
Slide 12: S-1 Get Off To A Good Start Use package management Consistent Standardized Simple
Slide 13: S-2 Get Off To A Good Start Use package management Different across systems Upgrades are an issue Trust your packager?
Slide 14: S-2 Get Off To A Good Start Use package management Different across systems Upgrades are an issue Trust your packager? Don't Be Afraid To Roll Your Own
Slide 15: S-4 Get Off To A Good Start Configure Logging $PGDATA/pg_log /var/log/pgsql when in doubt... (postgresql.conf) separate disk Logging is often overlooked, but is the first step toward troubleshooting!
Slide 16: S-5 Get Off To A Good Start Configure Authentication most systems have different defaults firewalls/ selinux (FATAL) rtfm (pg_hba.conf, grant, revoke)
Slide 17: S-6 Get Off To A Good Start Authentication Methods TRUST md5 IDENT
Slide 18: S-7 Get Off To A Good Start /contrib trust these more than your own code package dependent use different schemas (when able) pgcrypto pgstatstuple, pg_buffercache, pg_freespacemap
Slide 19: S-8 Get Off To A Good Start procedural languages package dependent some are noncore (plruby, plr, plphp) varying functionality varying levels of trust don't be afraid, test!
Slide 20: U-0 Let's Talk About Upgrades
Slide 21: U-1 Let's Talk About Upgrades Versioning First Digit (7.4.16 > 8.2.0) Second Digit (8.2.4 > 8.3.0) Third Digit (8.3.0 > 8.3.1)
Slide 22: U-2 Let's Talk About Upgrades Versioning First Digit (7.4.16 > 8.2.0) Second Digit (8.2.4 > 8.3.0) Third Digit (8.3.0 > 8.3.1)
Slide 23: U-3 Let's Talk About Upgrades Versioning First Digit (7.4.16 > 8.2.0) Second Digit (8.2.4 > 8.3.0) Third Digit (8.3.0 > 8.3.1)
Slide 24: U-4 Let's Talk About Upgrades Achtung!! Make Backups! Read the Release Notes!
Slide 25: U-5 Let's Talk About Upgrades pg_dump/pg_restore simple Fc is your friend dump with new version of pg_dump pitfalls (time, hdd)
Slide 26: U-6 Let's Talk About Upgrades the slony method not simple create slave on new version switchover (switch back?) pitfalls (initial synch, compatibility)
Slide 27: U-7 Let's Talk About Upgrades pg_migrator in place upgrades rewrites system catalog info no way to go back (fs snapshots) still new, underflux 8.1 > 8.2 only (for now)
Slide 28: U-8 Let's Talk About Upgrades upgrading older db <= 7.3 is no longer supported (upgrade now!) pg_dump 8.2 has issues with <= 7.2 you can upgrade to 7.3 first use adddepends on 7.3 install slony requires 7.3 (or 7.4) (or 8.3) or newer pg_migrator (lol)
Slide 29: C-0 Figure Your Configure
Slide 30: C-1 Figure Your Configure the basics : performance effective_cache_size shared_buffers default_statistics_target work_mem checkpoint_segments
Slide 31: C-2 Figure Your Configure the basics : logging stderr/pg_log vs. syslog/eventlog log_min_error_statement (error!) log_min_duration_statement log_line_prefix (%d, %p, %t)
Slide 32: C-3 Figure Your Configure other stuff worth looking at maintenance_work_mem max_prepared_transactions update_process_title max_fsm_pages synchronous_commit
Slide 33: M-0 Routine Maintenance
Slide 34: M-1 Routine Maintenance a word about vacuum reclaim usable space update table stats avoid xid wraparound
Slide 35: M-2 Routine Maintenance autovacuum : just do it! autovacuum track_activities track_counts autovacuum_max_freeze_age pg_autovacuum ?
Slide 36: M-3 Routine Maintenance other stuff worth looking at reindexing logfiles backups failover
Slide 37: P-0 Hardware For Software People
Slide 38: P-1 Hardware For Software People you can be cheap on the licensing, don't be cheap on the hardware Cram the (ecc) RAM Redundant Disks (Software|Hardware) RAID Prefer SCSI, Accept SATA RAID Z (JBOD)
Slide 39: P-2 Hardware For Software People cpu and you Multiple CPU's work wonders, up to 8 processors (16 OK) Opteron, Xeon, Core 2 Duo http://tweakers.net/
Slide 40: P-3 Hardware For Software People disk do's Put WAL on it's own disk (RAID 1) Put DATA directory on it's own disk (RAID 10) More Spindles is Good More Controllers Even Gooder (tablespaces!) Battery Backed Controllers, Write cache enabled
Slide 41: P-4 Hardware For Software People disk dont's NFS == Not Fully Safe! RAID 5 Beware disks / controllers that Lie
Slide 42: A-0 Availability (Not Scalability)
Slide 43: A-1 Availability what do we mean by availability? not backups (exactly) not replication (necessarily) not clustering (even less so)
Slide 44: A-2 Availability what do we mean by availability? not backups (exactly) not replication (necessarily) not clustering (even less so) if (kablooy) then (ok)
Slide 45: A-3 Availability pg_dump traditionally used for backups send dump to another server constantly run restore process large time, i/o constraints
Slide 46: A-4 Availability filesystem snapshots zfs, lvm built on top of pitr backup commands traditionally used for backups copy snapshot to another server cumbersome time, i/o constraints
Slide 47: A-5 Availability pitr create second, standby server ship wal logs to new server less time/io than pg_dump 8.1 > cold standby 8.2 > warm standby 8.4 > hot standby ?
Slide 48: A-6 Availability slony asynchronous, masterslave replication controlled switchover, failover low i/o, time constraints other benefits (upgrades, scaling)
Slide 49: A-7 Availability bucardo asynchronous, multimaster replication also does masterslave low i/o, time constraints other benefits (upgrades, scaling)
Slide 50: A-8 Availability shared disk one copy of PGDATA on shared storage standby takes over akin to db crash shared disk is point of failure (raid) STONITH
Slide 51: A-9 Availability filesystem replication drbd filesystem mirrored between servers synchronized, ordered writes single disk system?
Slide 52: A-10 Availability pgpool dualmaster, statement based little caveats (random(),now(),sequences) bigger caveats (security, password, pg_hba) pgpool becomes failure point
Slide 53: A-11 Availability postgresr multimaster, synchronous just open sourced this month! small community not proven
Slide 54: H-0 Scalability
Slide 55: H-1 Scalability what is scaling? “How well a solution to some problem will work when the relative size of the problem increases” Theo Schlossnagle
Slide 56: H-2 Scalability bigger, better, faster, more! postgresql scales up pretty well more disks (tablespaces) more cpu's, more ram connection pooling 1000+ connections, TB+ data
Slide 57: H-3 Scalability pgpool dualmaster, statement based little caveats (random(),now(),sequences) bigger caveats (security, password, pg_hba) pgpool becomes failure point
Slide 58: H-4 Scalability pg_bouncer simple connection pooler 10/1 > 40/1 caveats (prepared statements, temp tables) skype, myyearbook.com
Slide 59: H-5 Scalability slony asynchronous, masterslave replication multiple, cascading slaves scales read operations other benefits (upgrades, scaling) solid user base
Slide 60: H-6 Scalability bucardo asynchronous, multimaster replication also does masterslave low i/o, time constraints other benefits (upgrades, scaling)
Slide 61: H-7 Scalability pgpoolII single db over multiple machines scales read operations replication, load balance, parallel query green technology
Slide 62: H-8 Scalability pgcluster synchronous multimaster replication significant complexity scales read operations other uses (failover abilities) green technology
Slide 63: H-9 Scalability postgresr multimaster, synchronous just open source this month! small community other uses (failover abilities) not proven
Slide 64: H-10 Scalability pitr readonly slaves based on pitr, warm standby operation core team officially supporting development 8.4 > synchronous wal shipping 8.? > read only slaves
Slide 65: J-0 Query Your Queries
Slide 66: J-1 Query Your Queries finding slow queries: log_min_duration_statement 1, 0 , n superuser only alter user LOG: duration: 5005.273 ms statement: select pg_sleep(5);
Slide 67: J-2 Query Your Queries finding slow queries: pgfouine / pqa log analyzers command line, generate reports i/o load http://pgfouine.projects.postgresql.org/reports.html http://pqa.projects.postgresql.org/example.html
Slide 68: J-3 Query Your Queries pagila=# d pg_stat_all_tables View "pg_catalog.pg_stat_all_tables" Column | Type | ++ relid | oid | schemaname | name | relname | name | seq_scan | bigint | seq_tup_read | bigint | idx_scan | bigint | finding slow queries: idx_tup_fetch | bigint | n_tup_ins | bigint | pg_stat_all_tables n_tup_upd | bigint | n_tup_del | bigint | n_tup_hot_upd | bigint | n_live_tup | bigint | n_dead_tup | bigint | last_vacuum | timestamptz | last_autovacuum | timestamptz | last_analyze | timestamptz | last_autoanalyze | timestamptz |
Slide 69: J-4 Query Your Queries pagila=# d pg_stat_all_tables View "pg_catalog.pg_stat_all_tables" Column | Type | ++ relid | oid | schemaname | name | relname | name | seq_scan | bigint | seq_tup_read | bigint | idx_scan | bigint | finding slow queries: idx_tup_fetch | bigint | n_tup_ins | bigint | pg_stat_all_tables n_tup_upd | bigint | n_tup_del | bigint | n_tup_hot_upd | bigint | n_live_tup | bigint | n_dead_tup | bigint | last_vacuum | timestamptz | last_autovacuum | timestamptz | last_analyze | timestamptz | last_autoanalyze | timestamptz |
Slide 70: J-5 Query Your Queries pagila=# d pg_stat_all_tables View "pg_catalog.pg_stat_all_tables" Column | Type | ++ relid | oid | schemaname | name | relname | name | seq_scan | bigint | seq_tup_read | bigint | idx_scan | bigint | finding slow queries: idx_tup_fetch | bigint | n_tup_ins | bigint | pg_stat_all_tables n_tup_upd | bigint | n_tup_del | bigint | n_tup_hot_upd | bigint | n_live_tup | bigint | n_dead_tup | bigint | last_vacuum | timestamptz | last_autovacuum | timestamptz | last_analyze | timestamptz | last_autoanalyze | timestamptz |
Slide 71: J-6 Query Your Queries pagila=# d pg_stat_all_tables View "pg_catalog.pg_stat_all_tables" Column | Type | ++ relid | oid | schemaname | name | relname | name | seq_scan | bigint | seq_tup_read | bigint | idx_scan | bigint | finding slow queries: idx_tup_fetch | bigint | n_tup_ins | bigint | pg_stat_all_tables n_tup_upd | bigint | n_tup_del | bigint | n_tup_hot_upd | bigint | n_live_tup | bigint | n_dead_tup | bigint | last_vacuum | timestamptz | last_autovacuum | timestamptz | last_analyze | timestamptz | last_autoanalyze | timestamptz |
Slide 72: J-7 Query Your Queries pagila=# d pg_stat_all_tables View "pg_catalog.pg_stat_all_tables" Column | Type | ++ relid | oid | schemaname | name | relname | name | seq_scan | bigint | seq_tup_read | bigint | idx_scan | bigint | finding slow queries: idx_tup_fetch | bigint | n_tup_ins | bigint | pg_stat_all_tables n_tup_upd | bigint | n_tup_del | bigint | n_tup_hot_upd | bigint | n_live_tup | bigint | n_dead_tup | bigint | last_vacuum | timestamptz | last_autovacuum | timestamptz | last_analyze | timestamptz | last_autoanalyze | timestamptz |
Slide 73: J-8 Query Your Queries pagila=# d pg_stat_all_tables View "pg_catalog.pg_stat_all_tables" Column | Type | ++ relid | oid | schemaname | name | relname | name | seq_scan | bigint | seq_tup_read | bigint | idx_scan | bigint | finding slow queries: idx_tup_fetch | bigint | n_tup_ins | bigint | pg_stat_all_tables n_tup_upd | bigint | n_tup_del | bigint | n_tup_hot_upd | bigint | n_live_tup | bigint | n_dead_tup | bigint | last_vacuum | timestamptz | last_autovacuum | timestamptz | last_analyze | timestamptz | last_autoanalyze | timestamptz |
Slide 74: J-9 Query Your Queries finding slow queries: pg_stat_all_indexes pagila=# d pg_stat_all_indexes View "pg_catalog.pg_stat_all_indexes" Column | Type | ---------------+--------+ relid | oid | indexrelid | oid | schemaname | name | relname | name | indexrelname | name | idx_scan | bigint | idx_tup_read | bigint | idx_tup_fetch | bigint |
Slide 75: J-10 Query Your Queries finding slow queries: pg_stat_all_indexes pagila=# d pg_stat_all_indexes View "pg_catalog.pg_stat_all_indexes" Column | Type | ---------------+--------+ relid | oid | indexrelid | oid | schemaname | name | relname | name | indexrelname | name | idx_scan | bigint | idx_tup_read | bigint | idx_tup_fetch | bigint |
Slide 76: J-11 Query Your Queries finding slow queries: pg_statio_all_tables pagila=# d pg_statio_all_tables View "pg_catalog.pg_statio_all_tables" Column | Type | -----------------+--------+ relid | oid | schemaname | name | relname | name | heap_blks_read | bigint | heap_blks_hit | bigint | idx_blks_read | bigint | idx_blks_hit | bigint | toast_blks_read | bigint | toast_blks_hit | bigint | tidx_blks_read | bigint | tidx_blks_hit | bigint |
Slide 77: J-12 Query Your Queries finding slow queries: pg_statio_all_tables pagila=# d pg_statio_all_tables View "pg_catalog.pg_statio_all_tables" Column | Type | -----------------+--------+ relid | oid | schemaname | name | relname | name | heap_blks_read | bigint | heap_blks_hit | bigint | idx_blks_read | bigint | idx_blks_hit | bigint | toast_blks_read | bigint | toast_blks_hit | bigint | tidx_blks_read | bigint | tidx_blks_hit | bigint |
Slide 78: J-13 Query Your Queries fixing slow queries: explain analyze universal tool good for specific queries “explain” for large queries could be it's own talk
Slide 79: J-14 Query Your Queries fixing slow queries: explain analyze universal tool good for specific queries “explain” for large queries could be it's own talk http://wiki.postgresql.org/Using_EXPLAIN
Slide 80: J-15 Query Your Queries fixing slow queries: indexing (basic) use explain to find large sequential reads use pg_stat_* tables to find numerous reads btree – (gist/gin) enable_indexscan, enable_bitmapscan dual column vs. single column
Slide 81: J-16 Query Your Queries fixing slow queries: indexing (partial) restrain index to rows that matter can give significant speed improvements where clause of index should match where clause of query create index address_ba_part_idx on address (district) where district = 'Buenos Aires';
Slide 82: J-17 Query Your Queries fixing slow queries: indexing (partial) restrain index to rows that matter can give significant speed improvements where clause of index should match where clause of query create index customer_active_part_idx on customer (customer_id) where activebool is true;
Slide 83: J-18 Query Your Queries fixing slow queries: indexing (functional) some people prefer to call these expressional indexes
Slide 84: J-19 Query Your Queries fixing slow queries: indexing (expressional) push expensive functions into your index system sees just WHERE indexedcolumn = 'constant' expression of index should match expression of queries narrow scope, but nice gains create unique index one_true_email_xidx on customer (lower(email));
Slide 85: J-20 Query Your Queries fixing slow queries: indexing (expressional) push expensive functions into your index system sees just WHERE indexedcolumn = 'constant' expression of index should match expression of queries narrow scope, but nice gains create index fullname_xidx on customer ((first_name||' '||last_name));
Slide 86: J-21 Query Your Queries fixing slow queries: full text search uses lexmes and word stemming to find common words replacement for LIKE '%x%', ~* 'x'; supports multiple languages, custom dictionaries special indexing options
Slide 87: J-22 Indexing Options full text indexing gist vs. gin “old school” new in 8.2 slower for queries faster for queries faster insert / update slower insert / update mature stable
Slide 88: N-0 PostgreSQL Tablespaces
Slide 89: N-1 PostgreSQL Tablespaces tablespaces? define logical locations for object placement point to locations on disk (uses symlinks) size determined by disk size (not preordained) dedicate per db, split db across multiple tblspc
Slide 90: N-2 PostgreSQL Tablespaces tablespaces! split database over separate disks use stat, statio tables to gauge disk access create dedicated storage for workloads disk for read / write disk for read only large, slow disk for archiving disk for indexes
Slide 91: Q-0 PostgreSQL Partitioning
Slide 92: Q-1 PostgreSQL Partitioning partitioning? as table size grows, it becomes unmanageable use inheritance, rules, constraints to split data queries ignore nonrelevant partitions could be it's own talk
Slide 93: Q-2 PostgreSQL Partitioning partitioning! as table size grows, it becomes unmanageable use inheritance, rules, constraints to split data queries ignore nonrelevant partitions could be it's own talk http://www.pgcon.org/2007/schedule/events/41.en.html
Slide 94: Q-3 PostgreSQL Partitioning partitioning : key points determine list vs. range use triggers rather than rules partition creation vs. data population automate maintenance
Slide 95: I-0 Other Stuff I Should Mention
Slide 96: I-1 Other Stuff I Should Mention pgcrypto cryptography type functions /contrib (export issues) md5, sha1, blowfish, many more
Slide 97: I-2 Other Stuff I Should Mention dblink pg > pg connections /contrib (still under development?) can have performance issues on large queries make it live in it's own schema
Slide 98: I-3 Other Stuff I Should Mention *link heterogenous connections for postgresql db specific and db independent options any pl/u language can implement this similar performance issues to dblink dblinktds, dbilink, oralink, odbclink http://www.pgfoundry.org/ (db link)
Slide 99: I-4 Other Stuff I Should Mention autonomous logging tool persistent logging for postgresql functions built on top of dblink make it live in it's own schema https://labs.omniti.com/trac/pgsoltools
Slide 100: I-5 Other Stuff I Should Mention snapshot pitr clones full read/write copy of pitr slave static snapshot need solaris (zfs zone mojo) could reimplement on other systems https://labs.omniti.com/trac/pgsoltools
Slide 101: I-6 Other Stuff I Should Mention check_postgres nagios based monitoring script common items for warnings and alerts can be adapted to other uses http://bucardo.org/check_postgres
Slide 102: I-7 Other Stuff I Should Mention reconnoiter monitoring / graphing tool postgres based still pretty green https://labs.omniti.com/trac/reconnoiter
Slide 103: I-8 Other Stuff I Should Mention phpPgAdmin web based gui for postgresql remote administration of multiple servers implements much of postgresql functionality support back to 7.2? http://phppgadmin.sourceforge.net/
Slide 104: I-9 Other Stuff I Should Mention ;) my book?
Slide 105: I-10 Other Stuff I Should Mention ;) we're hiring Ops Ninjas Perl KungFu Artists PHP Ninjas Database Samurai http://omniti.com/is/hiring
Slide 106: L-0 El Fin




Add a comment on Slide 1
If you have a SlideShare account, login to comment; else you can comment as a guest- Favorites & Groups
Showing 1-50 of 2 (more)