Yesterday, in one of our projects, someone needed to check whether a given YouTube video could be played, and wrote the following code:
$request_url = 'http://youtube.com/get_video_info?video_id=' . $vid . '&el=vevo&fmt=18&asv=2&hd=1';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $request_url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$data = curl_exec($ch);
curl_close($ch);
unset($ch);
// The request itself failed.
if ($data === false) {
    $this->deleteVideo($video['id']);
    echo "Video: " . $vid . " video id " . $video['id'] . " is invalid, deleting.\n";
    continue;
}
parse_str($data, $details);
unset($data);
// empty() also covers the "not set" case.
if (empty($details['url_encoded_fmt_stream_map'])) {
    $this->deleteVideo($video['id']);
    echo "Video: " . $vid . " video id " . $video['id'] . " is invalid, deleting.\n";
    continue;
}
// Take the first stream entry and split it into key/value pairs.
$newstr = explode(",", $details['url_encoded_fmt_stream_map']);
$str1 = explode("&", $newstr[0]);
$obj = array();
for ($j = 0; $j < count($str1); $j++) {
    $str2 = explode("=", $str1[$j]);
    if (!isset($obj[$str2[0]])) {
        $obj[$str2[0]] = $str2[1];
    }
}
$ary_re['source_url'] = urldecode($obj['url']);
if (empty($ary_re['source_url'])) {
    $this->deleteVideo($video['id']);
    echo "Video: " . $vid . " video id " . $video['id'] . " is invalid, deleting.\n";
    continue;
} else {
    $bool = $this->getMobileCurl($ary_re['source_url']);
    // The video cannot be played.
    if ($bool === false) {
        $this->deleteVideo($video['id']);
        echo "Video: " . $vid . " video id " . $video['id'] . " is invalid, deleting.\n";
        continue;
    }
}
$right_nums++;
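As a side note, the hand-written explode loop above can be replaced by PHP's built-in parse_str, which splits on "&" and "=" and URL-decodes the values in one step. The sketch below is only an illustration: the function name firstStreamUrl and the sample value are made up, and unlike the loop above, parse_str keeps the last value when a key appears twice.

```php
<?php
// Hypothetical helper (not part of the original code): return the URL of the
// first stream in a url_encoded_fmt_stream_map value, or null if there is none.
function firstStreamUrl(string $streamMap): ?string
{
    // Stream entries are comma-separated; each entry is itself a query string.
    $entries = explode(',', $streamMap);

    // parse_str splits on "&" and "=" and URL-decodes every value.
    parse_str($entries[0], $fields);

    if (!isset($fields['url']) || !is_string($fields['url']) || $fields['url'] === '') {
        return null;
    }
    return $fields['url'];
}

// Made-up sample value, only for illustration.
$sample = 'itag=18&url=http%3A%2F%2Fexample.com%2Fv.mp4%3Fid%3D1,itag=22&url=http%3A%2F%2Fexample.com%2Fhd.mp4';
$first = firstStreamUrl($sample); // the decoded URL of the first entry
```

Returning null for a missing URL lets the caller keep a single "invalid, delete" branch instead of checking both isset and empty.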
The getMobileCurl function was defined as follows:
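function getMobileCurl($request_url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $request_url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    // With CURLOPT_RETURNTRANSFER set, the whole response body is returned into $data.
    $data = curl_exec($ch);
    curl_close($ch);
    unset($ch);
    if ($data === false) {
        return false;
    }
    return true;
}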
It uses curl_exec to decide whether the URL can be played, and that is where a problem showed up:
PHP reported that it had run out of memory!
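I debugged it and found that $data was taking up a huge amount of memory. When I opened that $request_url, which is the actual playback address of the YouTube video, I saw that the video was 1080p HD and more than an hour long. That explained it: $data holds the content of the video itself, so memory overflows.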
So I changed getMobileCurl right away to fetch only the server's response headers:
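function getMobileCurl($url)
{
    // Fetch only the response headers instead of the whole body.
    $res = get_headers($url, 1);
    if ($res === false) {
        return false; // the request itself failed
    }
    $response_status = $res[0]; // status line of the first response
    return strpos($response_status, "200") !== false
        || strpos($response_status, "302") !== false
        || strpos($response_status, "301") !== false;
}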
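If you would rather stay with cURL than switch to get_headers, roughly the same check can be done by setting CURLOPT_NOBODY, which makes cURL send a HEAD request and skip the body. This is a sketch, not the code we deployed: the names isPlayableStatus and getMobileCurlHead and the 20-second total timeout are my own assumptions, and some servers answer HEAD differently from GET, so test it against your real targets.

```php
<?php
// Treat 200, 301 and 302 as "reachable", mirroring the get_headers version above.
function isPlayableStatus(int $code): bool
{
    return in_array($code, [200, 301, 302], true);
}

// Hypothetical cURL variant of getMobileCurl: request the headers only.
function getMobileCurlHead(string $url): bool
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_NOBODY, true);          // HEAD request, no body download
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // do not echo anything
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
    curl_setopt($ch, CURLOPT_TIMEOUT, 20);           // assumed overall limit

    $ok = curl_exec($ch);
    // Redirects are not followed here, so this is the code of the first response.
    $code = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);

    // The handle is released when $ch goes out of scope.
    return $ok !== false && isPlayableStatus($code);
}
```

Comparing the numeric status code also avoids a small weakness of strpos on the status line, which would match "200" anywhere in that string.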
With the cURL GET request replaced, the problem was solved. The takeaway:
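When you use the cURL functions for crawling, check whether the target is something large such as a video or a file, and think about what you actually need from the response.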
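When you do need part of a body but cannot trust its size, one option is to cap the download with a CURLOPT_WRITEFUNCTION callback: cURL aborts the transfer as soon as the callback returns a byte count different from the chunk it was given. The sketch below assumes made-up names (makeCappedWriter, fetchPrefix) and is meant as a starting point rather than a finished implementation.

```php
<?php
// Hypothetical helper: build a write callback that stores at most $limit bytes
// in $buffer and then tells cURL to stop.
function makeCappedWriter(int $limit, string &$buffer): callable
{
    return function ($ch, string $chunk) use ($limit, &$buffer): int {
        $room = $limit - strlen($buffer);
        $taken = min($room, strlen($chunk));
        $buffer .= substr($chunk, 0, $taken);
        // Returning fewer bytes than were received makes cURL abort the transfer.
        return $taken;
    };
}

// Hypothetical usage: download at most $limit bytes from $url.
function fetchPrefix(string $url, int $limit): string
{
    $buffer = '';
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
    curl_setopt($ch, CURLOPT_WRITEFUNCTION, makeCappedWriter($limit, $buffer));
    // curl_exec returns false when the callback aborts; that is expected here.
    curl_exec($ch);
    return $buffer;
}
```

With a cap like this, even an hour-long 1080p video costs at most $limit bytes of memory.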