Create long-form audio

This document explains how to synthesize long audio. Long Audio Synthesis asynchronously synthesizes up to 1 million bytes of input. To learn more about the fundamental concepts in Cloud Text-to-Speech, read Cloud Text-to-Speech basics.

Before you begin

Before you send a request to the Cloud Text-to-Speech API, complete the following actions. See the Before you begin page for details.

1. Enable Cloud Text-to-Speech on a Google Cloud project.
2. Make sure that billing is enabled for Cloud Text-to-Speech.
3. Make sure that you have the following Identity and Access Management (IAM) roles on the output Cloud Storage bucket:
  - Storage Object Creator
  - Storage Object Viewer
4. Install the Google Cloud CLI. After installation, initialize the Google Cloud CLI by running the following command:
```
gcloud init
```
If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

Synthesize long audio from text using the command line

To convert long text to audio, send an HTTP POST request to the https://texttospeech.googleapis.com/v1beta1/projects/{$project_number}/locations/global:synthesizeLongAudio endpoint. In the body of the POST request, specify the following fields:

• voice: The type of voice to synthesize.

• input.text: The text to synthesize.

• audioConfig: The type of audio to create.

• output_gcs_uri: The Cloud Storage output path, in the format gs://bucket_name/file_name.wav.

• parent: The parent, in the format projects/{YOUR_PROJECT_NUMBER}/locations/{YOUR_PROJECT_LOCATION}.

The input can contain up to 1 MB of characters, but the exact limit can vary depending on the input.
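As a rough illustration of the fields listed above, the following Python sketch assembles a request body and runs a local size check before anything is sent. The names `build_request_body` and `MAX_INPUT_BYTES` are hypothetical, the default voice values are placeholders, and the 1,000,000-byte figure is only the approximate limit stated in this document, so the service may still reject input that passes this check.

```python
import json

# Approximate limit taken from this document ("up to 1 million bytes").
# The real limit can vary with the input, so this is only a local pre-check.
MAX_INPUT_BYTES = 1_000_000


def build_request_body(project_number, text, output_gcs_uri,
                       language_code="en-us", voice_name="en-us-Standard-A",
                       location="global"):
    """Return the request body as a dict, rejecting clearly oversized input."""
    # Measure the UTF-8 encoded size, not the number of characters.
    size = len(text.encode("utf-8"))
    if size > MAX_INPUT_BYTES:
        raise ValueError(f"input is {size} bytes; limit is {MAX_INPUT_BYTES}")
    return {
        "parent": f"projects/{project_number}/locations/{location}",
        "audio_config": {"audio_encoding": "LINEAR16"},
        "input": {"text": text},
        "voice": {"language_code": language_code, "name": voice_name},
        "output_gcs_uri": output_gcs_uri,
    }


if __name__ == "__main__":
    body = build_request_body("12345", "hello", "gs://bucket_name/file_name.wav")
    print(json.dumps(body, indent=2))
```

The size is measured on the encoded bytes because non-ASCII text takes more than one byte per character in UTF-8.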

Create a Cloud Storage bucket in the project used to run the synthesis. Make sure that the service account used to run the synthesis has read and write access to the output Cloud Storage bucket.
Run the REST request at the command line to synthesize audio from text using Cloud TTS. The command uses gcloud auth print-access-token to retrieve an authorization token for the request.

HTTP method and URL:
```
POST https://texttospeech.googleapis.com/v1beta1/projects/12345/locations/global:synthesizeLongAudio
```
Request JSON body:
```
{
  "parent": "projects/12345/locations/global",
  "audio_config":{
      "audio_encoding":"LINEAR16"
  },
  "input":{
      "text":"hello"
  },
  "voice":{
      "language_code":"en-us",
      "name":"en-us-Standard-A"
  },
  "output_gcs_uri": "gs://bucket_name/file_name.wav"
}
```
To send your request, use one of these options:
curl (Linux, macOS, or Cloud Shell)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you in to the gcloud CLI. To check which account is active, run gcloud auth list.

Save the request body in a file named request.json, and run the following command:
```
curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://texttospeech.googleapis.com/v1beta1/projects/12345/locations/global:synthesizeLongAudio"
```
‎PowerShell (Windows)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. To check which account is active, run gcloud auth list.

Save the request body in a file named request.json, and run the following command:
```
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://texttospeech.googleapis.com/v1beta1/projects/12345/locations/global:synthesizeLongAudio" | Select-Object -Expand Content
```
You should receive a JSON response similar to the following:
```
{
  "name": "23456",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.texttospeech.v1beta1.SynthesizeLongAudioMetadata",
    "progressPercentage": 0,
    "startTime": "2022-12-20T00:46:56.296191037Z",
    "lastUpdateTime": "2022-12-20T00:46:56.296191037Z"
  },
  "done": false
}
```
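The same POST request can also be issued from Python. The sketch below uses only the standard library and mirrors the curl command: same URL, same headers, same JSON body. The function names `build_synthesize_request` and `send` are hypothetical, and the access token is assumed to be obtained separately, for example by running gcloud auth print-access-token.

```python
import json
import urllib.request

API_ROOT = "https://texttospeech.googleapis.com/v1beta1"


def build_synthesize_request(project_number, body, access_token):
    """Build (but do not send) the POST request shown in the curl example."""
    url = f"{API_ROOT}/projects/{project_number}/locations/global:synthesizeLongAudio"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
    )


def send(request):
    """Send the request and decode the JSON response (needs network access)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it possible to inspect the URL and headers without network access or credentials.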
The JSON output of the REST command contains the name of the long-running operation in the name field. Run the following REST request at the command line to query the state of the long-running operation.

Make sure that the service account running the GET request belongs to the same project that was used for synthesis.

HTTP method and URL:
```
GET https://texttospeech.googleapis.com/v1beta1/projects/12345/locations/global/operations/23456
```
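In the sample responses in this document, the name field appears both as a bare ID ("23456") and as a full resource path. A small helper can build the GET URL from either form. This is a sketch only: `operation_url` is a hypothetical name, and it assumes the operation ID is always the last path segment.

```python
def operation_url(project_number, operation_name, location="global"):
    """Build the operations GET URL from a bare ID or a full resource name."""
    # Assumption: the operation ID is the final path segment of the name.
    operation_id = operation_name.rsplit("/", 1)[-1]
    return (
        "https://texttospeech.googleapis.com/v1beta1/"
        f"projects/{project_number}/locations/{location}/operations/{operation_id}"
    )
```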
To send your request, use one of these options:
curl (Linux, macOS, or Cloud Shell)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you in to the gcloud CLI. To check which account is active, run gcloud auth list.

Run the following command:
```
curl -X GET \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://texttospeech.googleapis.com/v1beta1/projects/12345/locations/global/operations/23456"
```
‎PowerShell (Windows)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. To check which account is active, run gcloud auth list.

Run the following command:
```
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method GET `
    -Headers $headers `
    -Uri "https://texttospeech.googleapis.com/v1beta1/projects/12345/locations/global/operations/23456" | Select-Object -Expand Content
```
You should receive a JSON response similar to the following:
```
{
  "name": "projects/12345/locations/global/operations/23456",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.texttospeech.v1beta1.SynthesizeLongAudioMetadata",
    "progressPercentage": 100
  },
  "done": true
}
```
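Because synthesis is asynchronous, the GET request usually has to be repeated until done is true. The following sketch shows one way to structure that polling loop. `wait_for_operation` and `fetch_operation` are hypothetical names: `fetch_operation` stands for any zero-argument callable that performs the GET request and returns the decoded JSON as a dict, and the interval and attempt count are arbitrary defaults.

```python
import time


def wait_for_operation(fetch_operation, poll_seconds=5.0, max_polls=60,
                       sleep=time.sleep):
    """Poll until the operation reports done, or give up after max_polls."""
    for attempt in range(max_polls):
        operation = fetch_operation()
        # A missing "done" field is treated as "not done yet".
        if operation.get("done"):
            return operation
        progress = operation.get("metadata", {}).get("progressPercentage", 0)
        print(f"poll {attempt + 1}: {progress}% complete")
        sleep(poll_seconds)
    raise TimeoutError(f"operation not done after {max_polls} polls")
```

Passing the fetch and sleep functions as arguments keeps the loop independent of any particular HTTP library and lets it be exercised with canned responses.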
To list all operations running in a given project, run the following REST request.

Make sure that the service account running the LIST request belongs to the same project that was used for synthesis.

HTTP method and URL:
```
GET https://texttospeech.googleapis.com/v1beta1/projects/12345/locations/global/operations
```
To send your request, use one of these options:
curl (Linux, macOS, or Cloud Shell)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you in to the gcloud CLI. To check which account is active, run gcloud auth list.

Run the following command:
```
curl -X GET \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     "https://texttospeech.googleapis.com/v1beta1/projects/12345/locations/global/operations"
```
‎PowerShell (Windows)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. To check which account is active, run gcloud auth list.

Run the following command:
```
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method GET `
    -Headers $headers `
    -Uri "https://texttospeech.googleapis.com/v1beta1/projects/12345/locations/global/operations" | Select-Object -Expand Content
```
You should receive a JSON response similar to the following:
```
{
  "operations": [
    {
      "name": "12345",
      "done": false
    },
    {
      "name": "23456",
      "done": false
    }
  ],
  "nextPageToken": ""
}
```
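A list response like the one above can be reduced to the operations that are still running. The sketch below assumes the response has already been decoded into a dict; `pending_operation_names` and `has_more_pages` are hypothetical helper names.

```python
def pending_operation_names(list_response):
    """Return the names of operations that have not finished yet."""
    return [
        op["name"]
        for op in list_response.get("operations", [])
        # Treat a missing "done" field the same as false.
        if not op.get("done", False)
    ]


def has_more_pages(list_response):
    """An empty or missing nextPageToken means there are no further pages."""
    return bool(list_response.get("nextPageToken"))
```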
After the long-running operation completes successfully, find the output audio file at the bucket URI given in the output_gcs_uri field. If the operation did not complete successfully, query it with the GET REST command to find the error, correct the error, and issue the RPC again.
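The success check described above can be sketched as a small function over the decoded GET response. It assumes the standard long-running operation shape, in which a finished operation that failed carries an error object with code and message fields. `operation_outcome` is a hypothetical name, and no error payload is shown in this document, so the error case is an assumption.

```python
def operation_outcome(operation):
    """Summarize a decoded operation response as a short status string."""
    if not operation.get("done"):
        return "running"
    # Assumption: a failed operation reports an "error" object.
    error = operation.get("error")
    if error:
        return f"failed: code {error.get('code')}: {error.get('message')}"
    return "succeeded"
```

Only when the outcome is "succeeded" is it worth looking for the audio file in the output bucket.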

Synthesize long audio from text using client libraries

Follow these instructions to synthesize long audio.

Install the client library

Python

Before installing the library, make sure that you've prepared your environment for Python development.

```
pip install --upgrade google-cloud-texttospeech
```

Create audio data

You can use Cloud TTS to create a long audio file of synthetic human speech. Use the following code to create a long audio file in a Cloud Storage bucket.

Python

Before running the example, make sure that you've prepared your environment for Python development.

```
# Copyright 2023 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from google.cloud import texttospeech


def synthesize_long_audio(project_id: str, output_gcs_uri: str) -> None:
    """
    Synthesizes long input, writing the resulting audio to `output_gcs_uri`.

    Args:
        project_id: ID or number of the Google Cloud project you want to use.
        output_gcs_uri: Specifies a Cloud Storage URI for the synthesis results.
            Must be specified in the format:
            ``gs://bucket_name/object_name``, and the bucket must
            already exist.
    """

    client = texttospeech.TextToSpeechLongAudioSynthesizeClient()

    input = texttospeech.SynthesisInput(
        text="Test input. Replace this with any text you want to synthesize, up to 1 million bytes long!"
    )

    audio_config = texttospeech.AudioConfig(
        audio_encoding=texttospeech.AudioEncoding.LINEAR16
    )

    voice = texttospeech.VoiceSelectionParams(
        language_code="en-US", name="en-US-Standard-A"
    )

    parent = f"projects/{project_id}/locations/us-central1"

    request = texttospeech.SynthesizeLongAudioRequest(
        parent=parent,
        input=input,
        audio_config=audio_config,
        voice=voice,
        output_gcs_uri=output_gcs_uri,
    )

    operation = client.synthesize_long_audio(request=request)
    # Set a deadline for your LRO to finish. 300 seconds is reasonable, but can be adjusted depending on the length of the input.
    # If the operation times out, that likely means there was an error. In that case, inspect the error, and try again.
    result = operation.result(timeout=300)
    print(
        "\nFinished processing, check your GCS bucket to find your audio file! Printing what should be an empty result: ",
        result,
    )
```
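The sample expects output_gcs_uri in the form gs://bucket_name/object_name. A local check along the following lines can catch a malformed URI before the synthesis request is sent. `parse_gcs_uri` is a hypothetical helper, not part of the client library, and it only checks the shape of the string. It does not verify that the bucket exists.

```python
def parse_gcs_uri(uri):
    """Split gs://bucket_name/object_name into (bucket_name, object_name)."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"expected a gs:// URI, got {uri!r}")
    # Everything up to the first slash is the bucket; the rest is the object.
    bucket, _, object_name = uri[len(prefix):].partition("/")
    if not bucket or not object_name:
        raise ValueError(f"expected gs://bucket_name/object_name, got {uri!r}")
    return bucket, object_name
```

For example, calling `parse_gcs_uri(output_gcs_uri)` at the top of a script lets it fail quickly with a clear message instead of waiting for the long-running operation to report an error.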

Clean up

To avoid unnecessary Google Cloud charges, use the Google Cloud console to delete your project if you don't need it.

What's next

Learn more about Cloud Text-to-Speech by reading the basics.
Review the list of available voices that you can use for synthetic speech.
