Slideshare.net (beta)

 

All comments

Add a comment on Slide 1

If you have a SlideShare account, login to comment; else you can comment as a guest


Showing 1-50 of 2 (more)

Pro PostgreSQL, OSCon 2008

From xzilla, 1 month ago

My Pro PostgreSQL talk, given at OSCon 2008

2108 views  |  0 comments  |  2 favorites  |  62 downloads  |  1 embed (Stats)
 

Tags

postgres postgresql database omniti availability failover install upgrade community oscon2008

more

 
 

Groups / Events

 

 
Embed
options

More Info

This slideshow is Public
Total Views: 2108
on Slideshare: 2106
from embeds: 2

Slideshow transcript

Slide 1: Pro PostgreSQL Robert Treat omniti.com brighterlamp.org

Slide 2: O-0 Who Am I?  (Why Listen To Me) PostgreSQL User Since 6.5.x DBA of High Traffic / Large PostgreSQL Instances Long Time Contributor to PostgreSQL Project Contribute / Maintain Several Open Source Projects Co-Author Beginning PHP & PostgreSQL 8 (Apress)

Slide 3: O-1 Outline What you need to know Availability about the project Scalability Getting started Query tuning Upgrading Tablespaces Configuring your server Partitioning Hardware Stuff you should know about

Slide 4: K-0 Know Your Way Around The Project

Slide 5: K-1 Know Your Way Around The Project www.postgresql.org  downloads  security alerts  documentation  wiki  bug reports  support companies  rss –> news – events ­ versions

Slide 6: K-2 Know Your Way Around The Project www.pgfoundry.org projects.postgresql.org  URI Type  CIText  Modules  SkyTools   Programs  Npgsql  Pl/Proxy  Resources  pg_bulkload  plpgsql­debugger  sample databases

Slide 7: K-3 Know Your Way Around The Project www.planetpostgresql.org  Project News   Community News  Helpful Tips / Examples

Slide 8: K-4 Know Your Way Around The Project archives.postgresql.org  mailing list archives back to 1997  full text search via postgtresql 8.3  keyword search suggestions  lists for users, developers, regional, user groups

Slide 9: K-5 Know Your Way Around The Project #postgresql  irc.freenode.net  real time help  rtfm_please ­ ??help

Slide 10: K-6 Know Your Way Around The Project project management  core team  committers  ­hackers  roadmap   web team

Slide 11: S-0 Get Off To A Good Start

Slide 12: S-1 Get Off To A Good Start Use package management Consistent Standardized Simple

Slide 13: S-2 Get Off To A Good Start Use package management Different across systems Upgrades are an issue Trust your packager?

Slide 14: S-2 Get Off To A Good Start Use package management Different across systems Upgrades are an issue Trust your packager? Don't Be Afraid To Roll Your Own

Slide 15: S-4 Get Off To A Good Start Configure Logging  $PGDATA/pg_log  /var/log/pgsql  when in doubt...  (postgresql.conf)  separate disk  Logging is often overlooked, but is the first step toward troubleshooting!

Slide 16: S-5 Get Off To A Good Start Configure Authentication most systems have different defaults firewalls/ selinux (FATAL) rtfm (pg_hba.conf, grant, revoke)

Slide 17: S-6 Get Off To A Good Start Authentication Methods TRUST md5 IDENT

Slide 18: S-7 Get Off To A Good Start /contrib trust these more than your own code package dependent use different schemas (when able) pgcrypto pgstatstuple, pg_buffercache, pg_freespacemap

Slide 19: S-8 Get Off To A Good Start procedural languages  package dependent  some are non­core     (plruby, plr, plphp)  varying functionality  varying levels of trust  don't be afraid, test!

Slide 20: U-0 Let's Talk About Upgrades

Slide 21: U-1 Let's Talk About Upgrades Versioning First Digit (7.4.16 ­> 8.2.0) Second Digit (8.2.4 ­> 8.3.0) Third Digit (8.3.0 ­> 8.3.1)

Slide 22: U-2 Let's Talk About Upgrades Versioning First Digit (7.4.16 ­> 8.2.0) Second Digit (8.2.4 ­> 8.3.0) Third Digit (8.3.0 ­> 8.3.1)

Slide 23: U-3 Let's Talk About Upgrades Versioning First Digit (7.4.16 ­> 8.2.0) Second Digit (8.2.4 ­> 8.3.0) Third Digit (8.3.0 ­> 8.3.1)

Slide 24: U-4 Let's Talk About Upgrades Achtung!! Make Backups! Read the Release Notes!

Slide 25: U-5 Let's Talk About Upgrades pg_dump/pg_restore  simple   ­Fc is your friend  dump with new version of  pg_dump  pitfalls (time, hdd)

Slide 26: U-6 Let's Talk About Upgrades the slony method  not simple  create slave on new version  switchover (switch back?)  pitfalls (initial synch, compatibility)

Slide 27: U-7 Let's Talk About Upgrades pg_migrator  in place upgrades  rewrites system catalog info  no way to go back (fs snapshots)  still new, under­flux  8.1 ­> 8.2 only (for now)

Slide 28: U-8 Let's Talk About Upgrades upgrading older db  <= 7.3 is no longer supported (upgrade now!)  pg_dump 8.2 has issues with <= 7.2   you can upgrade to 7.3 first  use adddepends on 7.3 install   slony requires 7.3 (or 7.4) (or 8.3) or newer  pg_migrator (lol)

Slide 29: C-0 Figure Your Configure 

Slide 30: C-1 Figure Your Configure the basics : performance  effective_cache_size  shared_buffers  default_statistics_target  work_mem  checkpoint_segments 

Slide 31: C-2 Figure Your Configure the basics : logging  stderr/pg_log vs. syslog/eventlog  log_min_error_statement (error!)  log_min_duration_statement  log_line_prefix (%d, %p, %t)

Slide 32: C-3 Figure Your Configure other stuff worth looking at  maintenance_work_mem  max_prepared_transactions  update_process_title  max_fsm_pages  synchronous_commit

Slide 33: M-0 Routine Maintenance

Slide 34: M-1 Routine Maintenance a word about vacuum  reclaim usable space  update table stats  avoid xid wraparound

Slide 35: M-2 Routine Maintenance autovacuum : just do it!  autovacuum  track_activities  track_counts  autovacuum_max_freeze_age  pg_autovacuum ?

Slide 36: M-3 Routine Maintenance other stuff worth looking at  reindexing  logfiles  backups  failover

Slide 37: P-0 Hardware For Software People

Slide 38: P-1 Hardware For Software People you can be cheap on the licensing,  don't be cheap on the hardware  Cram the (ecc) RAM  Redundant Disks   (Software|Hardware) RAID  Prefer SCSI, Accept SATA  RAID Z (JBOD)

Slide 39: P-2 Hardware For Software People cpu and you  Multiple CPU's work wonders,      up to 8 processors (16 OK)  Opteron, Xeon, Core 2 Duo   http://tweakers.net/

Slide 40: P-3 Hardware For Software People disk do's  Put WAL on it's own disk (RAID 1)  Put DATA directory on it's own disk (RAID 10)  More Spindles is Good   More Controllers Even Gooder (tablespaces!)  Battery Backed Controllers, Write cache enabled

Slide 41: P-4 Hardware For Software People disk dont's  NFS == Not Fully Safe!   RAID 5  Beware disks / controllers that Lie

Slide 42: A-0 Availability (Not Scalability)

Slide 43: A-1 Availability what do we mean by availability?  not backups (exactly)  not replication (necessarily)  not clustering (even less so)

Slide 44: A-2 Availability what do we mean by availability?  not backups (exactly)  not replication (necessarily)  not clustering (even less so) if (kablooy) then  (ok)

Slide 45: A-3 Availability pg_dump  traditionally used for backups   send dump to another server  constantly run restore process  large time, i/o constraints 

Slide 46: A-4 Availability filesystem snapshots  zfs, lvm  built on top of pitr backup commands  traditionally used for backups   copy snapshot to another server  cumbersome time, i/o constraints  

Slide 47: A-5 Availability pitr  create second, standby server  ship wal logs to new server  less time/io than pg_dump  8.1 ­> cold standby  8.2 ­> warm standby  8.4 ­> hot standby ?

Slide 48: A-6 Availability slony  asynchronous, master­slave replication  controlled switchover, failover  low i/o, time constraints  other benefits (upgrades, scaling)

Slide 49: A-7 Availability bucardo  asynchronous, multi­master replication  also does master­slave  low i/o, time constraints  other benefits (upgrades, scaling)

Slide 50: A-8 Availability shared disk  one copy of PGDATA on shared storage  standby takes over akin to db crash  shared disk is point of failure (raid)  STONITH

Slide 51: A-9 Availability filesystem replication  drbd  filesystem mirrored between servers  synchronized, ordered writes  single disk system? 

Slide 52: A-10 Availability pgpool  dual­master, statement based  little caveats (random(),now(),sequences)  bigger caveats (security, password, pg_hba)  pgpool becomes failure point

Slide 53: A-11 Availability postgres­r  multi­master, synchronous  just open sourced this month!   small community   not proven

Slide 54: H-0 Scalability

Slide 55: H-1 Scalability what is scaling? “How well a solution to some  problem will work when the  relative size of the problem  increases” ­ Theo Schlossnagle 

Slide 56: H-2 Scalability bigger, better, faster, more!  postgresql scales up pretty well  more disks (tablespaces)  more cpu's, more ram  connection pooling   1000+ connections, TB+ data

Slide 57: H-3 Scalability pgpool  dual­master, statement based  little caveats (random(),now(),sequences)  bigger caveats (security, password, pg_hba)  pgpool becomes failure point

Slide 58: H-4 Scalability pg_bouncer  simple connection pooler  10/1 ­> 40/1   caveats (prepared statements, temp tables)  skype, myyearbook.com

Slide 59: H-5 Scalability slony  asynchronous, master­slave replication  multiple, cascading slaves  scales read operations  other benefits (upgrades, scaling)  solid user base

Slide 60: H-6 Scalability bucardo  asynchronous, multi­master replication  also does master­slave  low i/o, time constraints  other benefits (upgrades, scaling)

Slide 61: H-7 Scalability pgpool­II  single db over multiple machines  scales read operations  replication, load balance, parallel query  green technology

Slide 62: H-8 Scalability pgcluster  synchronous multi­master replication  significant complexity  scales read operations  other uses (failover abilities)  green technology

Slide 63: H-9 Scalability postgres­r  multi­master, synchronous  just open source this month!  small community  other uses (failover abilities)  not proven

Slide 64: H-10 Scalability pitr read­only slaves  based on pitr, warm standby operation  core team officially supporting development  8.4 ­> synchronous wal shipping  8.? ­> read only slaves 

Slide 65: J-0 Query Your Queries

Slide 66: J-1 Query Your Queries finding slow queries: log_min_duration_statement  ­1, 0 , n   superuser only  alter user LOG: duration: 5005.273 ms statement: select pg_sleep(5);

Slide 67: J-2 Query Your Queries finding slow queries: pgfouine / pqa  log analyzers  command line, generate reports  i/o load http://pgfouine.projects.postgresql.org/reports.html http://pqa.projects.postgresql.org/example.html

Slide 68: J-3 Query Your Queries pagila=# d pg_stat_all_tables View "pg_catalog.pg_stat_all_tables"       Column      |    Type     | ­­­­­­­­­­­­­­­­­­+­­­­­­­­­­­­­+  relid            | oid         |  schemaname       | name        |  relname          | name        |  seq_scan         | bigint      |  seq_tup_read     | bigint      |  idx_scan         | bigint      | finding slow queries:  idx_tup_fetch    | bigint      |  n_tup_ins        | bigint      | pg_stat_all_tables  n_tup_upd        | bigint      |  n_tup_del        | bigint      |  n_tup_hot_upd    | bigint      |  n_live_tup       | bigint      |  n_dead_tup       | bigint      |  last_vacuum      | timestamptz |  last_autovacuum  | timestamptz |  last_analyze     | timestamptz |  last_autoanalyze | timestamptz |

Slide 69: J-4 Query Your Queries pagila=# d pg_stat_all_tables View "pg_catalog.pg_stat_all_tables"       Column      |    Type     | ­­­­­­­­­­­­­­­­­­+­­­­­­­­­­­­­+  relid            | oid         |  schemaname       | name        |  relname          | name        |  seq_scan         | bigint      |  seq_tup_read     | bigint      |  idx_scan         | bigint      | finding slow queries:  idx_tup_fetch    | bigint      |  n_tup_ins        | bigint      | pg_stat_all_tables  n_tup_upd        | bigint      |  n_tup_del        | bigint      |  n_tup_hot_upd    | bigint      |  n_live_tup       | bigint      |  n_dead_tup       | bigint      |  last_vacuum      | timestamptz |  last_autovacuum  | timestamptz |  last_analyze     | timestamptz |  last_autoanalyze | timestamptz |

Slide 70: J-5 Query Your Queries pagila=# d pg_stat_all_tables View "pg_catalog.pg_stat_all_tables"       Column      |    Type     | ­­­­­­­­­­­­­­­­­­+­­­­­­­­­­­­­+  relid            | oid         |  schemaname       | name        |  relname          | name        |  seq_scan         | bigint      |  seq_tup_read     | bigint      |  idx_scan         | bigint      | finding slow queries:  idx_tup_fetch    | bigint      |  n_tup_ins        | bigint      | pg_stat_all_tables  n_tup_upd        | bigint      |  n_tup_del        | bigint      |  n_tup_hot_upd    | bigint      |  n_live_tup       | bigint      |  n_dead_tup       | bigint      |  last_vacuum      | timestamptz |  last_autovacuum  | timestamptz |  last_analyze     | timestamptz |  last_autoanalyze | timestamptz |

Slide 71: J-6 Query Your Queries pagila=# d pg_stat_all_tables View "pg_catalog.pg_stat_all_tables"       Column      |    Type     | ­­­­­­­­­­­­­­­­­­+­­­­­­­­­­­­­+  relid            | oid         |  schemaname       | name        |  relname          | name        |  seq_scan         | bigint      |  seq_tup_read     | bigint      |  idx_scan         | bigint      | finding slow queries:  idx_tup_fetch    | bigint      |  n_tup_ins        | bigint      | pg_stat_all_tables  n_tup_upd        | bigint      |  n_tup_del        | bigint      |  n_tup_hot_upd    | bigint      |  n_live_tup       | bigint      |  n_dead_tup       | bigint      |  last_vacuum      | timestamptz |  last_autovacuum  | timestamptz |  last_analyze     | timestamptz |  last_autoanalyze | timestamptz |

Slide 72: J-7 Query Your Queries pagila=# d pg_stat_all_tables View "pg_catalog.pg_stat_all_tables"       Column      |    Type     | ­­­­­­­­­­­­­­­­­­+­­­­­­­­­­­­­+  relid            | oid         |  schemaname       | name        |  relname          | name        |  seq_scan         | bigint      |  seq_tup_read     | bigint      |  idx_scan         | bigint      | finding slow queries:  idx_tup_fetch    | bigint      |  n_tup_ins        | bigint      | pg_stat_all_tables  n_tup_upd        | bigint      |  n_tup_del        | bigint      |  n_tup_hot_upd    | bigint      |  n_live_tup       | bigint      |  n_dead_tup       | bigint      |  last_vacuum      | timestamptz |  last_autovacuum  | timestamptz |  last_analyze     | timestamptz |  last_autoanalyze | timestamptz |

Slide 73: J-8 Query Your Queries pagila=# d pg_stat_all_tables View "pg_catalog.pg_stat_all_tables"       Column      |    Type     | ­­­­­­­­­­­­­­­­­­+­­­­­­­­­­­­­+  relid            | oid         |  schemaname       | name        |  relname          | name        |  seq_scan         | bigint      |  seq_tup_read     | bigint      |  idx_scan         | bigint      | finding slow queries:  idx_tup_fetch    | bigint      |  n_tup_ins        | bigint      | pg_stat_all_tables  n_tup_upd        | bigint      |  n_tup_del        | bigint      |  n_tup_hot_upd    | bigint      |  n_live_tup       | bigint      |  n_dead_tup       | bigint      |  last_vacuum      | timestamptz |  last_autovacuum  | timestamptz |  last_analyze     | timestamptz |  last_autoanalyze | timestamptz |

Slide 74: J-9 Query Your Queries finding slow queries: pg_stat_all_indexes pagila=# d pg_stat_all_indexes View "pg_catalog.pg_stat_all_indexes" Column | Type | ---------------+--------+ relid | oid | indexrelid | oid | schemaname | name | relname | name | indexrelname | name | idx_scan | bigint | idx_tup_read | bigint | idx_tup_fetch | bigint |

Slide 75: J-10 Query Your Queries finding slow queries: pg_stat_all_indexes pagila=# d pg_stat_all_indexes View "pg_catalog.pg_stat_all_indexes" Column | Type | ---------------+--------+ relid | oid | indexrelid | oid | schemaname | name | relname | name | indexrelname | name | idx_scan | bigint | idx_tup_read | bigint | idx_tup_fetch | bigint |

Slide 76: J-11 Query Your Queries finding slow queries: pg_statio_all_tables pagila=# d pg_statio_all_tables View "pg_catalog.pg_statio_all_tables" Column | Type | -----------------+--------+ relid | oid | schemaname | name | relname | name | heap_blks_read | bigint | heap_blks_hit | bigint | idx_blks_read | bigint | idx_blks_hit | bigint | toast_blks_read | bigint | toast_blks_hit | bigint | tidx_blks_read | bigint | tidx_blks_hit | bigint |

Slide 77: J-12 Query Your Queries finding slow queries: pg_statio_all_tables pagila=# d pg_statio_all_tables View "pg_catalog.pg_statio_all_tables" Column | Type | -----------------+--------+ relid | oid | schemaname | name | relname | name | heap_blks_read | bigint | heap_blks_hit | bigint | idx_blks_read | bigint | idx_blks_hit | bigint | toast_blks_read | bigint | toast_blks_hit | bigint | tidx_blks_read | bigint | tidx_blks_hit | bigint |

Slide 78: J-13 Query Your Queries fixing slow queries: explain analyze  universal tool  good for specific queries  “explain” for large queries  could be it's own talk

Slide 79: J-14 Query Your Queries fixing slow queries: explain analyze  universal tool  good for specific queries  “explain” for large queries  could be it's own talk http://wiki.postgresql.org/Using_EXPLAIN

Slide 80: J-15 Query Your Queries fixing slow queries: indexing (basic)  use explain to find large sequential reads  use pg_stat_* tables to find numerous reads  btree – (gist/gin)  enable_indexscan, enable_bitmapscan  dual column vs. single column

Slide 81: J-16 Query Your Queries fixing slow queries: indexing (partial)  restrain index to rows that matter  can give significant speed improvements  where clause of index should match  where clause of query create index address_ba_part_idx on address (district) where district = 'Buenos Aires';

Slide 82: J-17 Query Your Queries fixing slow queries: indexing (partial)  restrain index to rows that matter  can give significant speed improvements  where clause of index should match  where clause of query create index customer_active_part_idx on customer (customer_id) where activebool is true;

Slide 83: J-18 Query Your Queries fixing slow queries: indexing (functional)  some people prefer to call these expressional indexes

Slide 84: J-19 Query Your Queries fixing slow queries: indexing (expressional)  push expensive functions into your index  system sees  just WHERE indexedcolumn = 'constant'  expression of index should match expression of queries  narrow scope, but nice gains create unique index one_true_email_xidx on customer (lower(email));

Slide 85: J-20 Query Your Queries fixing slow queries: indexing (expressional)  push expensive functions into your index  system sees  just WHERE indexedcolumn = 'constant'  expression of index should match expression of queries  narrow scope, but nice gains create index fullname_xidx on customer ((first_name||' '||last_name));

Slide 86: J-21 Query Your Queries fixing slow queries: full text search  uses lexmes and word stemming to find common words  replacement for LIKE '%x%', ~* 'x';  supports multiple languages, custom dictionaries  special indexing options

Slide 87: J-22 Indexing Options full text indexing gist vs. gin “old school” new in 8.2 slower for queries faster for queries faster insert / update slower insert / update mature stable

Slide 88: N-0 PostgreSQL Tablespaces

Slide 89: N-1 PostgreSQL Tablespaces tablespaces?  define logical locations for object placement  point to locations on disk (uses symlinks)  size determined by disk size (not pre­ordained)  dedicate per db, split db across multiple tblspc 

Slide 90: N-2 PostgreSQL Tablespaces tablespaces!  split database over separate disks  use stat, statio tables to gauge disk access  create dedicated storage for workloads  disk for read / write  disk for read only  large, slow disk for archiving  disk for indexes 

Slide 91: Q-0 PostgreSQL Partitioning

Slide 92: Q-1 PostgreSQL Partitioning partitioning?  as table size grows, it becomes unmanageable  use inheritance, rules, constraints to split data  queries ignore non­relevant partitions  could be it's own talk

Slide 93: Q-2 PostgreSQL Partitioning partitioning!  as table size grows, it becomes unmanageable  use inheritance, rules, constraints to split data  queries ignore non­relevant partitions  could be it's own talk http://www.pgcon.org/2007/schedule/events/41.en.html

Slide 94: Q-3 PostgreSQL Partitioning partitioning : key points  determine list vs. range  use triggers rather than rules  partition creation vs. data population   automate maintenance

Slide 95: I-0 Other Stuff I Should Mention

Slide 96: I-1 Other Stuff I Should Mention pgcrypto  cryptography type functions  /contrib (export issues)   md5, sha1, blowfish, many more

Slide 97: I-2 Other Stuff I Should Mention dblink  pg ­> pg connections  /contrib (still under development?)  can have performance issues on large queries  make it live in it's own schema

Slide 98: I-3 Other Stuff I Should Mention *­link  heterogenous connections for postgresql  db specific and db independent options  any pl/u language can implement this   similar performance issues to dblink  dblink­tds, dbi­link, oralink, odbclink http://www.pgfoundry.org/ (db link)

Slide 99: I-4 Other Stuff I Should Mention autonomous logging tool  persistent logging for postgresql functions  built on top of dblink   make it live in it's own schema https://labs.omniti.com/trac/pgsoltools

Slide 100: I-5 Other Stuff I Should Mention snapshot pitr clones  full read/write copy of pitr slave  static snapshot  need solaris (zfs zone mojo)  could re­implement on other systems https://labs.omniti.com/trac/pgsoltools

Slide 101: I-6 Other Stuff I Should Mention check_postgres  nagios based monitoring script  common items for warnings and alerts  can be adapted to other uses http://bucardo.org/check_postgres

Slide 102: I-7 Other Stuff I Should Mention reconnoiter  monitoring / graphing tool  postgres based  still pretty green https://labs.omniti.com/trac/reconnoiter

Slide 103: I-8 Other Stuff I Should Mention phpPgAdmin  web based gui for postgresql  remote administration of multiple servers  implements much of postgresql functionality  support back to 7.2? http://phppgadmin.sourceforge.net/

Slide 104: I-9 Other Stuff I Should Mention  ;­) my book?

Slide 105: I-10 Other Stuff I Should Mention  ;­) we're hiring  Ops Ninjas  Perl Kung­Fu Artists  PHP Ninjas  Database Samurai http://omniti.com/is/hiring

Slide 106: L-0 El Fin