Debugging a strange http.Response Read behavior in Go

First, some background.

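With Dolt[1], you can push and pull a local MySQL-compatible database to and from a remote. Remotes are managed with the dolt remote CLI command, which supports several types of remotes[2]. A Dolt remote can be a separate directory, an s3 bucket, or any grpc service that implements the ChunkStoreService protocol buffer definition. remotesrv is Dolt's open-source implementation of ChunkStoreService. It also runs a simple HTTP file server that is used to transfer data between the remote and its clients.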

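Earlier this week, we ran into an interesting problem in the interaction between the Dolt CLI and the remotesrv HTTP file server. Solving it required understanding the HTTP/1.1 protocol and digging into the Golang source code. In this blog, we discuss how Golang's net/http package automatically sets the Transfer-Encoding header of an HTTP response, and how that changes the behavior of http.Response.Body Read calls on the client.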

A strange Dolt CLI error

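This investigation started with a report from a Dolt user. They had set up remotesrv to host their Dolt database and were using the Dolt CLI to pull upstream changes into a local clone. While push worked fine, pull would appear to make some progress and then fail with a suspicious error: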

throughput below minimum allowable

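This particular error was suspicious because it indicated that the Dolt client had failed to download data from remotesrv's HTTP file server at the minimum rate of 1024 bytes per second. Our initial assumption was that parallel downloads were causing some kind of congestion in the download path, but that was not the case. We found that the error only occurred on large downloads, and that those downloads were serialized, so congestion was unlikely. We dug deeper into how throughput was being measured and found something surprising.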

How we measure throughput

Let's start with an overview of Golang's io.Reader interface. The interface lets you Read bytes from some source into a buffer b:

func (T) Read(b []byte) (n int, err error)

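As part of its contract, it guarantees that it will not read more than len(b) bytes, and that the number of bytes read into b is always returned as n. A particular Read call can return 0 bytes, 10 bytes, or even 134,232,001 bytes, as long as b is large enough. When the reader runs out of bytes to read, it returns an end-of-file (EOF) error that you can test for.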

When you make an HTTP call in Golang with the net/http package, the response body is exposed as an io.Reader, and you Read bytes from it. Given the io.Reader contract, we know that any particular Read call may retrieve anywhere from 0 bytes to the entire body.

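In our investigation, we found that a download of 134,232,001 bytes failed to meet our minimum throughput, but the reason was not immediately apparent. Using Wireshark[3], we could see that the data was being transferred fast enough, and that the problem seemed to lie in how the Dolt CLI measured throughput.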

Here is some pseudocode describing how throughput is measured:

type measurement struct {
 N int
 T time.Time
}
type throughputReader struct {
 io.Reader
 ms chan measurement
}
func (r throughputReader) Read(bs []byte) (int, error) {
 n, err := r.Reader.Read(bs)
 r.ms <- measurement{n, time.Now()}
 return n, err
}
func ReadNWithMinThroughput(r io.Reader, n int64, min_bps int64) ([]byte, error) {
 ms := make(chan measurement)
 defer close(ms)
 r = throughputReader{r, ms}
 bytes := make([]byte, n)
 go func() {
  for {
   select {
   case _, ok := <-ms:
    if !ok {
     return
    }
    // Add sample to a window of samples.
   case <-time.After(1 * time.Second):
   }
   // Calculate the throughput by selecting a window of samples,
   // summing the sampled bytes read, and dividing by the window length. If the
   // throughput is less than |min_bps|, cancel our context.
  }
 }()
 _, err := io.ReadFull(r, bytes)
 return bytes, err
}

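The code above reveals the culprit. A sample is only recorded when a Read call returns, so if a single Read call takes a long time, no throughput samples arrive, and eventually our measurement code reports a throughput of 0 bytes and fails with an error. This is further supported by the fact that small downloads completed while larger downloads consistently failed.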

But how do we prevent these large Reads, and what causes some reads to be large and others to be small?

Let's investigate by dissecting how an HTTP response is built on the server and how it is parsed by the client.

Writing HTTP responses

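In Golang, you use an http.ResponseWriter to return data to the client. You can use the writer to write headers and a body, but a lot of underlying logic controls which headers are actually written and how the body is encoded.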

For example, in our http file server we never set the Content-Type or Transfer-Encoding headers. We simply call Write once with a buffer holding the data we need to return. But if we inspect the response headers with curl:

=> curl -sSL -D - http://localhost:8080/dolthub/test/53l5... -o /dev/null
HTTP/1.1 200 OK
Date: Wed, 09 Mar 2022 01:21:28 GMT
Content-Type: application/octet-stream
Transfer-Encoding: chunked

We can see that both the Content-Type and Transfer-Encoding headers have been set! Moreover, Transfer-Encoding is set to chunked!

Here is a comment we found in net/http/server.go[4] that explains this:

// The Life Of A Write is like this:
//
// Handler starts. No header has been sent. The handler can either
// write a header, or just start writing. Writing before sending a header
// sends an implicitly empty 200 OK header.
//
// If the handler didn't declare a Content-Length up front, we either
// go into chunking mode or, if the handler finishes running before
// the chunking buffer size, we compute a Content-Length and send that
// in the header instead.
//
// Likewise, if the handler didn't set a Content-Type, we sniff that
// from the initial chunk of output.
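The behavior described in that comment can be observed with a throwaway test server. The sketch below is an illustration under assumptions: the helper fetch and the body sizes (100 bytes and 1 MiB) are arbitrary choices, and the handler writes its body in a single Write without setting any headers, as remotesrv did.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// fetch serves n zero bytes with a single Write and no explicit headers,
// then reports how the client saw the response framed.
func fetch(n int) (transferEncoding []string, contentLength int64, err error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write(bytes.Repeat([]byte{0}, n))
	}))
	defer srv.Close()

	resp, err := srv.Client().Get(srv.URL)
	if err != nil {
		return nil, 0, err
	}
	defer resp.Body.Close()
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		return nil, 0, err
	}
	return resp.TransferEncoding, resp.ContentLength, nil
}

func main() {
	// Small body: the handler finishes within the chunking buffer,
	// so net/http computes a Content-Length.
	fmt.Println(fetch(100))
	// Large body: no Content-Length was declared, so net/http
	// switches to chunked transfer encoding.
	fmt.Println(fetch(1 << 20))
}
```

On the client, a chunked response shows up as a TransferEncoding of ["chunked"] and a ContentLength of -1, while the small response carries its exact length.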

Here is how Wikipedia[5] explains chunked transfer encoding:

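Chunked transfer encoding is a streaming data transfer mechanism available in version 1.1 of the Hypertext Transfer Protocol (HTTP). In chunked transfer encoding, the data stream is divided into a series of non-overlapping "chunks". The chunks are sent out and received independently of one another. No knowledge of the data stream outside the currently-being-processed chunk is necessary for either the sender or the receiver at any given time.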

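Each chunk is preceded by its size in bytes. The transmission ends when a zero-length chunk is received. The chunked keyword in the Transfer-Encoding header is used to indicate chunked transfer. An early form of chunked transfer encoding was proposed in 1994.[6] Chunked transfer encoding is not supported in HTTP/2, which provides its own mechanisms for data streaming.[7]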
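To see what this framing looks like on the wire, the sketch below encodes two writes with net/http/httputil.NewChunkedWriter. That helper exists in the standard library but is not needed by normal applications (net/http adds chunking automatically), so it is used here only as an illustration. The function name encodeChunks is hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http/httputil"
)

// encodeChunks writes each string as its own chunk and returns the
// bytes that would be sent over the connection.
func encodeChunks(writes ...string) (string, error) {
	var wire bytes.Buffer
	cw := httputil.NewChunkedWriter(&wire)
	for _, s := range writes {
		if _, err := io.WriteString(cw, s); err != nil {
			return "", err
		}
	}
	// Close emits the terminating zero-length chunk.
	if err := cw.Close(); err != nil {
		return "", err
	}
	return wire.String(), nil
}

func main() {
	wire, err := encodeChunks("hello", ", world")
	fmt.Printf("%q %v\n", wire, err)
	// "5\r\nhello\r\n7\r\n, world\r\n0\r\n" <nil>
}
```

Each chunk is its size in hexadecimal, CRLF, the data, and another CRLF; the final "0" chunk marks the end of the stream.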

Reading HTTP responses

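To read the body of an http response, net/http provides Response.Body, which is an io.Reader. It also contains logic that hides HTTP implementation details. Regardless of the transfer encoding used, the provided io.Reader returns only the bytes that were originally written to the response body. It automatically "de-chunks" a chunked response.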
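A minimal sketch of that de-chunking, using net/http/httputil.NewChunkedReader on a hand-written chunked stream. Normal applications never call this directly because Response.Body already does it; the helper name decodeChunks and the sample payload are assumptions made for the example.

```go
package main

import (
	"fmt"
	"io"
	"net/http/httputil"
	"strings"
)

// decodeChunks strips the chunk framing from wire and returns only the
// payload bytes, which is what a Response.Body reader hands to callers.
func decodeChunks(wire string) (string, error) {
	body, err := io.ReadAll(httputil.NewChunkedReader(strings.NewReader(wire)))
	return string(body), err
}

func main() {
	body, err := decodeChunks("5\r\nhello\r\n7\r\n, world\r\n0\r\n\r\n")
	fmt.Printf("%q %v\n", body, err) // "hello, world" <nil>
}
```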

We looked at this "de-chunking" in more detail to understand why it can result in a large Read.

Writing and reading chunks

If you look at the chunkedWriter implementation, you can see that every Write produces a new chunk, regardless of its size:

// Write the contents of data as one chunk to Wire.
func (cw *chunkedWriter) Write(data []byte) (n int, err error) {

 // Don't send 0-length data. It looks like EOF for chunked encoding.
 if len(data) == 0 {
  return 0, nil
 }

 if _, err = fmt.Fprintf(cw.Wire, "%x\r\n", len(data)); err != nil {
  return 0, err
 }
 if n, err = cw.Wire.Write(data); err != nil {
  return
 }
 if n != len(data) {
  err = io.ErrShortWrite
  return
 }
 if _, err = io.WriteString(cw.Wire, "\r\n"); err != nil {
  return
 }
 if bw, ok := cw.Wire.(*FlushAfterChunkWriter); ok {
  err = bw.Flush()
 }
 return
}

In remotesrv, we first load the requested data into a buffer and then call Write once. So we send 1 large chunk over the network.

In chunkedReader, we see that a single Read call will read that entire chunk from the network:

func (cr *chunkedReader) Read(b []uint8) (n int, err error) {
 for cr.err == nil {
  if cr.checkEnd {
   if n > 0 && cr.r.Buffered() < 2 {
    // We have some data. Return early (per the io.Reader
    // contract) instead of potentially blocking while
    // reading more.
    break
   }
   if _, cr.err = io.ReadFull(cr.r, cr.buf[:2]); cr.err == nil {
    if string(cr.buf[:]) != "\r\n" {
     cr.err = errors.New("malformed chunked encoding")
     break
    }
   } else {
    if cr.err == io.EOF {
     cr.err = io.ErrUnexpectedEOF
    }
    break
   }
   cr.checkEnd = false
  }
  if cr.n == 0 {
   if n > 0 && !cr.chunkHeaderAvailable() {
    // We've read enough. Don't potentially block
    // reading a new chunk header.
    break
   }
   cr.beginChunk()
   continue
  }
  if len(b) == 0 {
   break
  }
  rbuf := b
  if uint64(len(rbuf)) > cr.n {
   rbuf = rbuf[:cr.n]
  }
  var n0 int
  /*
  Annotation by Dhruv:
  This Read call directly calls Read on |net.Conn| if |rbuf| is larger
  than the underlying |bufio.Reader|'s buffer size.
  */
  n0, cr.err = cr.r.Read(rbuf)
  n += n0
  b = b[n0:]
  cr.n -= uint64(n0)
  // If we're at the end of a chunk, read the next two
  // bytes to verify they are "\r\n".
  if cr.n == 0 && cr.err == nil {
   cr.checkEnd = true
  } else if cr.err == io.EOF {
   cr.err = io.ErrUnexpectedEOF
  }
 }
 return n, cr.err
}

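Because every response from our HTTP file server is served and read as a single chunk, the time a Read call takes to return depends entirely on the size of the requested data. When we downloaded a large amount of data (134,232,001 bytes), those Read calls consistently timed out.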

Fixing the problem

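We had two candidate solutions. We could produce smaller chunks by breaking up the http.ResponseWriter Write calls, or we could explicitly set the Content-Length header, which bypasses chunked transfer encoding entirely.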
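The second option can be sketched with a throwaway httptest server. This is an illustration only, not the remotesrv handler: the helper name fetchWithContentLength and the 1 MiB body are assumptions.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// fetchWithContentLength serves n zero bytes with an explicit
// Content-Length header and reports how the client saw the response framed.
func fetchWithContentLength(n int) (transferEncoding []string, contentLength int64, err error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Declaring the length up front keeps net/http out of chunking mode.
		w.Header().Set("Content-Length", strconv.Itoa(n))
		w.Write(bytes.Repeat([]byte{0}, n))
	}))
	defer srv.Close()

	resp, err := srv.Client().Get(srv.URL)
	if err != nil {
		return nil, 0, err
	}
	defer resp.Body.Close()
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		return nil, 0, err
	}
	return resp.TransferEncoding, resp.ContentLength, nil
}

func main() {
	fmt.Println(fetchWithContentLength(1 << 20)) // [] 1048576 <nil>
}
```

Even though the body is far larger than the chunking buffer, the response is sent with a plain Content-Length and no Transfer-Encoding header.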

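We decided to break up the http.ResponseWriter Write by using io.Copy. io.Copy produces Writes of at most 32 * 1024 (32,768) bytes. To use it, we refactored our code to supply the needed data as an io.Reader instead of as one large buffer. Using io.Copy is an idiomatic pattern for piping data between an io.Reader and an io.Writer.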
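The write sizes that io.Copy produces can be observed with a writer that records them. This is a sketch under assumptions: recordingWriter and copySizes are hypothetical names, and the source is deliberately wrapped so that io.Copy takes its generic buffered path. io.Copy delegates to src.WriteTo or dst.ReadFrom when either is implemented, in which case the 32 KB write size is not guaranteed.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// recordingWriter remembers the size of every Write it receives.
type recordingWriter struct {
	sizes []int
}

func (w *recordingWriter) Write(p []byte) (int, error) {
	w.sizes = append(w.sizes, len(p))
	return len(p), nil
}

// copySizes copies data through io.Copy and returns the size of each Write.
func copySizes(data []byte) ([]int, error) {
	// Wrapping in struct{ io.Reader } hides bytes.Reader's WriteTo method,
	// which would otherwise hand all of data to the writer in one Write.
	src := struct{ io.Reader }{bytes.NewReader(data)}
	dst := &recordingWriter{}
	_, err := io.Copy(dst, src)
	return dst.sizes, err
}

func main() {
	sizes, err := copySizes(make([]byte, 100000))
	fmt.Println(sizes, err) // [32768 32768 32768 1696] <nil>
}
```

When the destination turns every Write into a chunk, as chunkedWriter does, each of these writes becomes its own chunk of at most 32,768 bytes.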

You can view the PR containing these changes here[8].

Conclusion

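In summary, we discovered that when writing a response, http.ResponseWriter uses chunked transfer encoding if no Content-Length is set and the size of the write is larger than the chunking buffer size. Correspondingly, when we read the response, the chunkedReader tries to read an entire chunk from the net.Conn. Because remotesrv wrote one very large chunk, Read calls in the Dolt CLI always took too long and triggered the throughput error. We fixed this by writing smaller chunks.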

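Working with the net/http package and the rest of the Golang standard library has been pleasant. Since most of the standard library is written in Go itself and can be viewed on Github, the source code is easy to read. Although there was little documentation on the specific problem at hand, it only took an hour or two to dig down to the root cause. I'm personally excited to keep working on Dolt and to deepen my knowledge of Go.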

Article source: http://m.fisionsoft.com.cn/article/dpecdgj.html